Trading off performance and human oversight in algorithmic policy: evidence from Danish college admissions

Magnus Lindgaard Nielsen, Jonas Skjold Raaschou-Pedersen, Emil Chrisander, David Dreyer Lassen, Julien Grenet, Anna Rogers, Andreas Bjerre-Nielsen

TL;DR

The paper investigates whether pre-admission predictions of degree completion can improve centralized college admissions. It leverages a nationwide Danish dataset (2006–2017) and compares transformer and LSTM sequential models against baseline tabular models and GPA- and human-based rankings, using both academic-only and richer input variants. Results show that all ML approaches outperform the current GPA- and human-based criteria, with the transformer achieving the highest predictive accuracy in the main academic-input setting, although the gains of complex models over simple ones are modest. The study also analyzes fairness (ABROCA), program matching within fields, and economic implications via the Marginal Value of Public Funds (MVPF), finding potential welfare gains under policy contractions while highlighting regulatory and transparency challenges for high-stakes algorithmic admissions.

Abstract

Student dropout is a significant concern for educational institutions due to its social and economic impact, driving the need for risk prediction systems to identify at-risk students before enrollment. We explore the accuracy of such systems in the context of higher education by predicting degree completion before admission, with potential applications for prioritizing admissions decisions. Using a large-scale dataset from Danish higher education admissions, we demonstrate that advanced sequential AI models offer more precise and fair predictions compared to current practices that rely on either high school grade point averages or human judgment. These models not only improve accuracy but also outperform simpler models, even when the simpler models use protected sociodemographic attributes. Importantly, our predictions reveal how certain student profiles are better matched with specific programs and fields, suggesting potential efficiency and welfare gains in public policy. We estimate that even the use of simple AI models to guide admissions decisions, particularly in response to a newly implemented nationwide policy reducing admissions by 10 percent, could yield significant economic benefits. However, this improvement would come at the cost of reduced human oversight and lower transparency. Our findings underscore both the potential and challenges of incorporating advanced AI into educational policymaking.

Paper Structure

This paper contains 33 sections, 16 equations, 16 figures, 13 tables.

Figures (16)

  • Figure 1: Admission to higher education and sequence representation. Panel A illustrates the general process for admission to higher education in Denmark, depicting two systems for ranking applicants: a mandatory GPA ranking, where all students are ranked by their high school GPA, and a merit-based human ranking, which students can opt into voluntarily. Both rankings are used as inputs in a variant of the deferred acceptance mechanism [bjerre-nielsen_voluntary_2022]. Each student receives a single enrollment offer from the highest-ranked institution where they qualify. Panel B displays the data observed after student enrollment, showing how information is stored in tabular form. Panel C shows the creation of 10 sequences based on the tabular representations from Panel B, where the different colors indicate the source of the information in the tabular representation. These sequences begin with time-invariant events (e.g., socio-demographic information) followed by chronologically ordered events (e.g., grades). Each sequence encapsulates information regarding a specific aspect of events, represented by a column in the tabular representation, with the $i$th position in each sequence describing the $i$th event, regardless of its relevance.
  • Figure 2: Predictive performance of models and admission criteria. Panel A shows the model performance measured by AUC scores for different input types (in rows) and models (in columns). Panels B and C display the actual student completion rates as a function of predicted completion rates. The rates are binned by percent deciles, and performance is shown by model rank (solid lines) or observed admission rank (dashed lines). The performance scores are split by admission method: Panel B for GPA-based admissions and Panel C for human assessment. Students are perfectly ranked if completion rates start at 0% and abruptly jump to 100%. Horizontal dotted lines indicate the mean completion rate within each sample.
  • Figure 3: Fairness of admission criteria and algorithms. Measures of fairness for different models and inputs based on the ABROCA metric, weighted across institutions. Higher ABROCA values indicate more unequal performance, with 0 corresponding to no performance difference between groups. The fairness scores are divided by admission method: Panel A for GPA-based admissions and Panel B for human assessment. The columns display the fairness measure by sensitive attributes: whether a student is a Danish native (Native), sex (Female), and whether the student is above or below median socioeconomic status (SES).
  • Figure 4: Value of adopting prediction-based admission for different scenarios. The figure shows the net present value for various cost and development time scenarios. The two panels depict two different time horizons for implementing the prediction-based admission policies, corresponding to similar delays in revenue generation. We use the lowest revenue estimate if algorithmic screening were adopted, which is based on a logistic regression. The dotted line indicates a revenue of 0, and the black marker indicates our estimated cost.
  • Figure 5: Sequential model architectures with aggregate embeddings. Panel A illustrates the pre-norm encoder-only transformer architecture, while Panel B presents the LSTM architecture. Panel C provides an example of input embedding and summation, corresponding to the red elements shown in Panels A and B. For each event, values are first embedded into vectors, and the aggregate embedding of each event is represented as a summation of these vectors. This process matches the input embedding and summation shown at the bottom of Panel A.
  • ...and 11 more figures
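
The ABROCA metric named in the Figure 3 caption can be illustrated with a minimal sketch. It assumes the standard definition of ABROCA as the area between the ROC curves of two groups, so that 0 means identical ROC curves, as the caption states. The function names `roc_points` and `abroca`, the grid size, and the toy data are illustrative assumptions and are not taken from the paper; the weighting across institutions mentioned in the caption is not implemented here.

```python
import numpy as np


def roc_points(y_true, scores):
    """Return (fpr, tpr) arrays of the empirical ROC curve, starting at (0, 0).

    Assumes y_true contains both classes (0 and 1).
    """
    y_true = np.asarray(y_true, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # sort by descending score
    y_sorted, s_sorted = y_true[order], scores[order]
    tps = np.cumsum(y_sorted)        # true positives at each threshold
    fps = np.cumsum(1.0 - y_sorted)  # false positives at each threshold
    # Keep one point per distinct score so tied scores share a threshold.
    last = np.r_[np.diff(s_sorted) != 0, True]
    tpr = np.r_[0.0, tps[last] / tps[-1]]
    fpr = np.r_[0.0, fps[last] / fps[-1]]
    return fpr, tpr


def abroca(y_true, scores, group, n_grid=1001):
    """Area between the ROC curves of the two groups in `group` (hypothetical helper)."""
    y_true, scores, group = map(np.asarray, (y_true, scores, group))
    labels = np.unique(group)
    if labels.size != 2:
        raise ValueError("this sketch handles exactly two groups")
    grid = np.linspace(0.0, 1.0, n_grid)  # common false-positive-rate grid
    curves = []
    for g in labels:
        mask = group == g
        fpr, tpr = roc_points(y_true[mask], scores[mask])
        curves.append(np.interp(grid, fpr, tpr))
    gap = np.abs(curves[0] - curves[1])
    # Trapezoidal rule, written out to avoid NumPy version differences.
    return float(np.sum((gap[1:] + gap[:-1]) / 2.0 * np.diff(grid)))


# Toy data: group "a" is ranked perfectly, group "b" is ranked in reverse.
y = [1, 1, 0, 0, 0, 0, 1, 1]
s = [0.9, 0.8, 0.2, 0.1, 0.9, 0.8, 0.2, 0.1]
g = ["a", "a", "a", "a", "b", "b", "b", "b"]
value = abroca(y, s, g)
```

On this toy data the two ROC curves are as far apart as possible, so `value` is close to 1. When both groups have the same labels and scores, the function returns 0.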