Affirmative Action vs. Affirmative Information

Claire Lazar Reich

TL;DR

We prove that uncertainty has a disparate impact across demographic groups, and we show that acquiring additional data can eliminate the disparity and broaden access to opportunity.

Abstract

Critical decisions in hiring, college admissions, and credit lending are guided by predictions made in the presence of uncertainty. While uncertainty imparts errors across all demographic groups, this paper shows that the types of errors vary systematically: Groups with higher average outcomes are typically assigned higher false positive rates, while those with lower average outcomes are assigned higher false negative rates. We characterize the conditions that give rise to this disparate impact and explain why the intuitive remedy to omit demographic variables from datasets does not correct it. Instead of data omission, this paper examines how data acquisition can broaden access to opportunity. The strategy, which we call "Affirmative Information," could stand as an alternative to Affirmative Action.
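
The error pattern described above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the paper: ability is normally distributed with group-specific means, the signal is ability plus Gaussian noise, "qualified" means ability above zero, and every group faces the same cutoff on the signal (a stand-in for thresholding a group-blind prediction). The function name `simulate_error_rates` and all parameter values are hypothetical.

```python
import random


def simulate_error_rates(group_means, noise_sd, threshold=0.0, n=50_000, seed=0):
    """Return {group: (false_positive_rate, false_negative_rate)}.

    Toy model (an assumption, not the paper's): ability A ~ N(mean_g, 1),
    signal X = A + N(0, noise_sd[g]^2), "qualified" means A > 0, and every
    group faces the same cutoff on X.
    """
    rng = random.Random(seed)
    rates = {}
    for group, mean in group_means.items():
        false_pos = false_neg = positives = negatives = 0
        for _ in range(n):
            ability = rng.gauss(mean, 1.0)                       # true outcome A
            signal = ability + rng.gauss(0.0, noise_sd[group])   # noisy signal X
            accepted = signal > threshold                        # group-blind cutoff
            if ability > 0.0:
                positives += 1
                false_neg += not accepted
            else:
                negatives += 1
                false_pos += accepted
        rates[group] = (false_pos / negatives, false_neg / positives)
    return rates


means = {"H": 0.5, "L": -0.5}  # hypothetical group means
baseline = simulate_error_rates(means, noise_sd={"H": 1.0, "L": 1.0})
enriched = simulate_error_rates(means, noise_sd={"H": 1.0, "L": 0.5})

for label, rates in (("baseline", baseline), ("less noise for L", enriched)):
    for group, (fpr, fnr) in rates.items():
        print(f"{label:>16} | group {group}: FPR={fpr:.3f}  FNR={fnr:.3f}")
```

In this toy setting the higher-mean group `H` ends up with the higher false positive rate and the lower-mean group `L` with the higher false negative rate, even though the cutoff never looks at group membership. The second run reduces the signal noise for `L` only, a rough analogue of acquiring more data about that group, and lowers its false negative rate.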

Paper Structure

This paper contains 12 sections, 6 theorems, 28 equations, 4 figures.

Key Result

Proposition 1

Consider the distribution of the best predictions $\mathbb{E}[A|X]$ for group $G=g$. Its variance is smaller than the variance of ability for $g$ and its mean depends on the joint distribution of $(A,X,G)$. If $X$ fully explains the relationship between $A$ and $G$, so that $\mathbb{E} [A|X ] = \mat

Figures (4)

  • Figure 1: (a) Members of the higher-mean group are more likely to be classified as creditworthy. (b) Members of the higher-mean group are more likely to be classified as creditworthy, even when we condition on true ability. Although the predictions are group-blind and involve no disparate treatment, there is disparate impact.
  • Figure 2: (a) Regression toward the mean centers predictions $\mathbb{E}[A|X]$ at the mean of ability $A$, with smaller variance. Separately considering predictions by group, however, reveals that the effect of regression toward the mean depends on the correlation between $(X,G)$. (b) If $X \perp G$, the predictions have the same distribution. (c) If $(X,G)$ are correlated, then the mean predictions will differ by group. (d) Furthermore, if $X$ actually explains the relationship between $(A,G)$, so that $\mathbb{E} [A|X] = \mathbb{E} [A|X,G]$, then predictions are centered at group-specific means. Note that the figure depicts normal distributions for illustration, but no normality assumption is required for any statistical result in Section \ref{when-sec}.
  • Figure 3: Plots in the left column illustrate the case with equal-variance signals, and plots in the right column are based on higher-variance signals for the $L$ group. In (a), expected signals conditional on true underlying ability are systematically higher for $H$ members. In (b) and (c), we plot the distribution of signals for qualified applicants ($A>0$) in a lemon-dropping and a cherry-picking market, respectively. The group-specific true positive rates are given by the shaded portion under each curve. Members of $H$ have higher true positive rates, and the difference is particularly stark in the cherry-picking example. The imbalance is alleviated when the $L$ signals become more informative, as seen in (d)-(f).
  • Figure 4: (a) Scores based on a limited dataset are bunched at low values for both creditworthy and defaulting applicants. (b) Scores based on an enriched dataset, under the Affirmative Information scheme, better distinguish creditworthy applicants. (c) When blind prediction (gray) is replaced by Affirmative Action (blue), the true positive rate (TPR) of the higher-mean group falls and the TPR of the lower-mean group increases. Meanwhile, when blind prediction is replaced by Affirmative Information (pink), the TPR of the higher-mean group is roughly unchanged and the TPR of the lower-mean group increases. For all but the lowest values of $k$, Affirmative Information is associated with higher TPRs for both groups than Affirmative Action. (d) Affirmative Information corresponds to greater lender profits than both blind prediction and Affirmative Action. (e) Affirmative Action tends to reduce the total number of good loans administered compared to blind prediction, whereas Affirmative Information increases that number beyond the levels in both alternative schemes. (f) All three schemes yield similar numbers of bad loans administered.
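
The regression-toward-the-mean statements in the Figure 2 caption can be checked against two standard identities. This is a sketch for the pooled population, not a reproduction of the paper's proofs. The law of iterated expectations and the law of total variance give panel (a):

$$
\mathbb{E}\big[\mathbb{E}[A|X]\big] = \mathbb{E}[A],
\qquad
\mathrm{Var}(A) = \mathbb{E}\big[\mathrm{Var}(A|X)\big] + \mathrm{Var}\big(\mathbb{E}[A|X]\big) \;\ge\; \mathrm{Var}\big(\mathbb{E}[A|X]\big).
$$

For panel (d), if $\mathbb{E}[A|X] = \mathbb{E}[A|X,G]$, then applying iterated expectations within group $g$ shows that the predictions are centered at the group-specific mean:

$$
\mathbb{E}\big[\mathbb{E}[A|X] \,\big|\, G=g\big]
= \mathbb{E}\big[\mathbb{E}[A|X,G] \,\big|\, G=g\big]
= \mathbb{E}[A \,|\, G=g].
$$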

Theorems & Definitions (11)

  • Proposition 1
  • proof
  • Definition 1
  • Lemma 1
  • Theorem 1
  • Lemma 2
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • ...and 1 more