Fairness May Backfire: When Leveling-Down Occurs in Fair Machine Learning

Yi Yang, Xiangyu Chang, Pei-yu Chen

TL;DR

This work analyzes two deployment regimes for ML classifiers under common legal and governance constraints: attribute-aware decision-making (sensitive attributes available at decision time) and attribute-blind decision-making (sensitive attributes excluded from prediction).

Abstract

As machine learning (ML) systems increasingly shape access to credit, jobs, and other opportunities, the fairness of algorithmic decisions has become a central concern. Yet it remains unclear when enforcing fairness constraints in these systems genuinely improves outcomes for affected groups or instead leads to "leveling down," making one or both groups worse off. We address this question in a unified, population-level (Bayes) framework for binary classification under prevalent group fairness notions. Our Bayes approach is distribution-free and algorithm-agnostic, isolating the intrinsic effect of fairness requirements from finite-sample noise and from training and intervention specifics. We analyze two deployment regimes for ML classifiers under common legal and governance constraints: attribute-aware decision-making (sensitive attributes available at decision time) and attribute-blind decision-making (sensitive attributes excluded from prediction). We show that, in the attribute-aware regime, fair ML necessarily (weakly) improves outcomes for the disadvantaged group and (weakly) worsens outcomes for the advantaged group. In contrast, in the attribute-blind regime, the impact of fairness is distribution-dependent: fairness can benefit or harm either group and may shift both groups' outcomes in the same direction, leading to either leveling up or leveling down. We characterize the conditions under which these patterns arise and highlight the role of "masked" candidates in driving them. Overall, our results provide structural guidance on when pursuing algorithmic fairness is likely to improve group outcomes and when it risks systemic leveling down, informing fair ML design and deployment choices.

Paper Structure (18 sections, 10 theorems, 5 equations, 3 figures)

Key Result

Lemma 1

For a cost parameter $c \in [0, 1]$, all Bayes-optimal classifiers $f^{*}(v) \in \mathop{\rm arg\,min}_{f\in\mathcal{F}} R_{cs}(f ; c)$ have the form $f^{*}(v)=\mathbf{1}\left[h(v)>0\right]+\alpha\cdot\mathbf{1}\left[h(v)=0\right],$ for all $v\in\mathcal{V}$, where $h(v)=\eta(v)-c$, and $\alpha \in
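The threshold form in Lemma 1 can be illustrated with a minimal sketch. The statement above is cut off before the range of $\alpha$ is given, so the code assumes that $\alpha$ is a tie-breaking probability in $[0, 1]$. It also assumes a standard cost-sensitive risk, $c \cdot P(f=1, Y=0) + (1-c) \cdot P(f=0, Y=1)$, as a stand-in for $R_{cs}$, which is not defined in this excerpt. The function names and the toy values of $\eta$ are hypothetical.

```python
import numpy as np


def bayes_optimal_classifier(eta, c, alpha=0.5):
    """Selection probabilities f*(v) = 1[h(v) > 0] + alpha * 1[h(v) = 0], with h = eta - c.

    alpha is ASSUMED to be a tie-breaking probability in [0, 1];
    the source statement is truncated before its range is given.
    """
    h = np.asarray(eta, dtype=float) - c
    return np.where(h > 0, 1.0, np.where(h == 0, alpha, 0.0))


def cost_sensitive_risk(f, eta, c):
    """ASSUMED stand-in for R_cs(f; c): average cost over equally weighted points.

    Selecting a point costs c when Y = 0; rejecting it costs (1 - c) when Y = 1.
    """
    f = np.asarray(f, dtype=float)
    eta = np.asarray(eta, dtype=float)
    return float(np.mean(f * c * (1.0 - eta) + (1.0 - f) * (1.0 - c) * eta))


# Hypothetical toy values of eta(v) = P(Y = 1 | V = v) at three points.
eta = np.array([0.1, 0.3, 0.7])
c = 0.3

f_star = bayes_optimal_classifier(eta, c, alpha=0.5)

# Compare against randomly drawn randomized classifiers on the same points.
rng = np.random.default_rng(0)
risk_star = cost_sensitive_risk(f_star, eta, c)
other_risks = [cost_sensitive_risk(rng.uniform(size=eta.size), eta, c) for _ in range(1000)]
```

Under this assumed risk, the pointwise cost of selecting $v$ with probability $f(v)$ is $(1-c)\,\eta(v) + f(v)\,(c - \eta(v))$, so selecting is optimal exactly when $\eta(v) > c$, and any $f(v)$ is optimal when $\eta(v) = c$. That is why the choice of $\alpha$ does not change the risk in this sketch.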

Figures (3)

  • Figure 1: Outcomes under classical and fairness-aware ML.
  • Figure 2: Selection Regions Before and After Enforcing Fairness in the Attribute-Aware Regime.
  • Figure 3: Selection Regions Before and After Enforcing Fairness in the Attribute-Blind Regime.

Theorems & Definitions (19)

  • Definition 1: Randomized Classifier
  • Definition 2
  • Lemma 1: Bayes-optimal Classifiers
  • Definition 3: Demographic Parity
  • Definition 4: Equal Opportunity
  • Definition 5: Predictive Equality
  • Definition 6: Disparity Measures
  • Remark 1
  • Lemma 2: Bayes-optimal Fair Classifiers
  • Proposition 1: Opposite Threshold Shifts Across Groups
  • ...and 9 more