Algorithms for College Admissions Decision Support: Impacts of Policy Change and Inherent Variability

Jinsook Lee, Emma Harvey, Joyce Zhou, Nikhil Garg, Thorsten Joachims, Rene F. Kizilcec

TL;DR

The paper investigates how the end of race-conscious admissions (following the Supreme Court's SFFA ruling) and the inherent randomness of ML-driven applicant ranking affect who is prioritized for review at selective colleges. Using four years of data from a selective, test-optional institution, the authors train a gradient-boosted ranking model and simulate a policy change that removes race data, then compare the result against baselines and against removing another informative variable, such as intended major. They find that race-unaware rankings substantially reduce URM representation in the top review pool (about a 62% relative drop) without a corresponding rise in academic merit, and that removing race data harms diversity more than excluding other informative variables does. Individual applicant outcomes are also substantially arbitrary because of model multiplicity, measured across bootstrapped models: across-policy arbitrariness exceeds within-policy arbitrariness, especially for applicants who are usually top-ranked. The work argues that the SFFA policy change is unlikely to resolve the core tensions in selective admissions, and it calls for methods that preserve diversity and reduce arbitrariness without relying on race data in rankings.

Abstract

Each year, selective American colleges sort through tens of thousands of applications to identify a first-year class that displays both academic merit and diversity. In the 2023-2024 admissions cycle, these colleges faced unprecedented challenges. First, the number of applications has been steadily growing. Second, test-optional policies that have remained in place since the COVID-19 pandemic limit access to key information historically predictive of academic success. Most recently, longstanding debates over affirmative action culminated in the Supreme Court banning race-conscious admissions. Colleges have explored machine learning (ML) models to address the issues of scale and missing test scores, often via ranking algorithms intended to focus on 'top' applicants. However, the Court's ruling will force changes to these models, which were able to consider race as a factor in ranking. There is currently a poor understanding of how these mandated changes will shape applicant ranking algorithms, and, by extension, admitted classes. We seek to address this by quantifying the impact of different admission policies on the applications prioritized for review. We show that removing race data from a developed applicant ranking algorithm reduces the diversity of the top-ranked pool without meaningfully increasing the academic merit of that pool. We contextualize this impact by showing that excluding data on applicant race has a greater impact than excluding other potentially informative variables like intended majors. Finally, we measure the impact of policy change on individuals by comparing the arbitrariness in applicant rank attributable to policy change to the arbitrariness attributable to randomness. We find that any given policy has a high degree of arbitrariness and that removing race data from the ranking algorithm increases arbitrariness in outcomes for most applicants.

Paper Structure (43 sections, 2 equations, 8 figures, 2 tables)

This paper contains 43 sections, 2 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Impact of policy changes on the racial and ethnic diversity of the top-rated group of applicants. Statistically significant differences in proportion of URM applicants present in the top-rated group of applicants compared to the ML baseline model are denoted with an asterisk. In Graph (b), 95% confidence intervals for the ML models are shown based on results over 1,000 bootstraps.
  • Figure 2: Impact of policy changes on the socioeconomic diversity of the top-rated group of applicants. Statistically significant differences in proportion of first-gen and low SES applicants present in the top-rated group of applicants compared to the ML baseline model are denoted with an asterisk. 95% confidence intervals for the ML models are shown based on results over 1,000 bootstraps.
  • Figure 3: Impact of policy changes on the academic merit of the top-rated group of applicants. Statistically significant differences in standardized test percentile and share of applicants actually admitted or waitlisted of the top-rated group of applicants compared with the ML baseline model are denoted with an asterisk. In Graph (a), 95% confidence intervals for the ML models are shown based on results over 1,000 bootstraps. In Graph (b), the darker blue line represents the mean standardized test percentile within the specified applicant pool.
  • Figure 4: Graph (a) shows the cumulative distribution function (CDF) of self-consistency within 1,000 bootstraps of the ML baseline model for the applicant pool (blue line), only applicants who are usually top-ranked (ranked in the top by >50% of bootstrapped models, orange line), and only applicants who are usually not top-ranked (ranked in the top by <=50% of bootstrapped models, green line). The dashed black line corresponds to $sc = 0.95$. Graph (b) shows the level of arbitrariness if we define an applicant's outcomes to be consistent if their $sc \geq 0.95$ (and their outcomes to be arbitrary if their $sc < 0.95$): only 9% of applicants are consistently ranked in the top, 60% of applicants are consistently not ranked in the top, and 31% of applicants have arbitrary outcomes.
  • Figure 5: Graph (a) shows the CDF of self-consistency for all applicants within 1,000 bootstraps of the ML baseline model (blue line), within 1,000 bootstraps of the 'No race' model (orange line), and across 500 bootstraps of each model (green line). Graph (b) shows CDFs only for those applicants who are usually top-ranked (ranked in the top by >50% of bootstrapped models). Graph (c) shows the level of arbitrariness in the ML baseline model compared to the 'No race' model, if we define an applicant's outcomes to be consistent if and only if their $sc \geq 0.95$: 9% of applicants are consistently ranked in the top under the ML baseline model, and 5% are consistently ranked in the top under the 'No race' policy.
  • ...and 3 more figures
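
The captions for Figures 4 and 5 rely on a self-consistency score $sc$ with a 0.95 cutoff, but this summary does not define it. The sketch below is illustrative only and assumes a common definition: the probability that two distinct models, drawn at random from the bootstrapped set, agree on whether an applicant is top-ranked. The function names, the vote counts, and the applicant labels are all hypothetical and do not come from the paper.

```python
def self_consistency(top_votes: int, n_models: int) -> float:
    """Assumed definition: probability that two distinct bootstrapped models
    agree on whether an applicant is placed in the top-ranked group.

    top_votes: number of models that place the applicant in the top group.
    n_models:  total number of bootstrapped models.
    """
    if n_models < 2:
        raise ValueError("need at least two models to compare")
    if not 0 <= top_votes <= n_models:
        raise ValueError("top_votes must lie between 0 and n_models")
    # Ordered pairs of distinct models that disagree: one says "top", the other "not top".
    disagreeing_pairs = 2 * top_votes * (n_models - top_votes)
    all_pairs = n_models * (n_models - 1)
    return 1.0 - disagreeing_pairs / all_pairs


def categorize(top_votes: int, n_models: int, threshold: float = 0.95) -> str:
    """Label an applicant's outcome using the sc >= 0.95 cutoff from the captions."""
    if self_consistency(top_votes, n_models) < threshold:
        return "arbitrary"
    # "Usually top-ranked" follows the captions: top-ranked by more than 50% of models.
    if top_votes > n_models / 2:
        return "consistently top"
    return "consistently not top"


if __name__ == "__main__":
    # Hypothetical vote counts out of 1,000 bootstrapped models (made-up data).
    votes = {"applicant_a": 1000, "applicant_b": 990, "applicant_c": 600, "applicant_d": 10}
    for name, k in votes.items():
        print(name, round(self_consistency(k, 1000), 4), categorize(k, 1000))
```

Under this assumed definition, an applicant placed in the top group by 600 of 1,000 models has a score near 0.52 and counts as arbitrary, while one placed there by 990 of 1,000 models scores about 0.98 and counts as consistently top-ranked.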