Fairness May Backfire: When Leveling-Down Occurs in Fair Machine Learning

Yi Yang; Xiangyu Chang; Pei-yu Chen

TL;DR

This work analyzes two deployment regimes for ML classifiers under common legal and governance constraints: attribute-aware decision-making (sensitive attributes available at decision time) and attribute-blind decision-making (sensitive attributes excluded from prediction).

Abstract

As machine learning (ML) systems increasingly shape access to credit, jobs, and other opportunities, the fairness of algorithmic decisions has become a central concern. Yet it remains unclear when enforcing fairness constraints in these systems genuinely improves outcomes for affected groups or instead leads to "leveling down," making one or both groups worse off. We address this question in a unified, population-level (Bayes) framework for binary classification under prevalent group fairness notions. Our Bayes approach is distribution-free and algorithm-agnostic, isolating the intrinsic effect of fairness requirements from finite-sample noise and from training and intervention specifics. We analyze two deployment regimes for ML classifiers under common legal and governance constraints: attribute-aware decision-making (sensitive attributes available at decision time) and attribute-blind decision-making (sensitive attributes excluded from prediction). We show that, in the attribute-aware regime, fair ML necessarily (weakly) improves outcomes for the disadvantaged group and (weakly) worsens outcomes for the advantaged group. In contrast, in the attribute-blind regime, the impact of fairness is distribution-dependent: fairness can benefit or harm either group and may shift both groups' outcomes in the same direction, leading to either leveling up or leveling down. We characterize the conditions under which these patterns arise and highlight the role of "masked" candidates in driving them. Overall, our results provide structural guidance on when pursuing algorithmic fairness is likely to improve group outcomes and when it risks systemic leveling down, informing fair ML design and deployment choices.

Paper Structure (18 sections, 10 theorems, 5 equations, 3 figures)

This paper contains 18 sections, 10 theorems, 5 equations, 3 figures.

Introduction
Related Work
Preliminaries
Classification
Fairness-Aware Classification
Fairness Constraints
Bayes-Optimal Fair Classifier
Fair ML Impacts in the Attribute-Aware Regime
Impact on Different Groups
Redistribution Mechanism: The Role of the Sensitive Attribute
Advantaged group (deletion only).
Disadvantaged group (inclusion only).
Fair ML Impacts in the Attribute-Blind Regime
Impact on Different Groups
Redistribution Mechanism: The Role of Masked Candidates
...and 3 more sections

Key Result

Lemma 1

For a cost parameter $c \in [0, 1]$, all Bayes-optimal classifiers $f^{*}(v) \in \mathop{\rm arg\,min}_{f\in\mathcal{F}} R_{cs}(f ; c)$ have the form $f^{*}(v)=\mathbf{1}\left[h(v)>0\right]+\alpha\cdot\mathbf{1}\left[h(v)=0\right],$ for all $v\in\mathcal{V}$, where $h(v)=\eta(v)-c$, and $\alpha \in
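The thresholding form in Lemma 1 can be illustrated with a minimal Python sketch. Several things here are assumptions and not the authors' implementation: the function name `bayes_optimal_classifier` is hypothetical, $\eta(v)$ is taken to be the posterior probability $P(Y=1 \mid V=v)$ supplied as an array, and $\alpha$ is treated as a tie-breaking probability in $[0, 1]$ (the statement above is cut off before giving its range). The returned value is the probability of a positive decision at each point.

```python
import numpy as np

def bayes_optimal_classifier(eta, c, alpha=0.5):
    """Selection probabilities of the form 1[h > 0] + alpha * 1[h = 0], h = eta - c.

    eta   : array of assumed posterior probabilities P(Y=1 | V=v)
    c     : cost parameter in [0, 1], acting as the decision threshold
    alpha : assumed tie-breaking probability used where eta equals c exactly
    """
    eta = np.asarray(eta, dtype=float)
    h = eta - c
    # Select with probability 1 above the threshold, alpha on the boundary, 0 below.
    return np.where(h > 0, 1.0, np.where(h == 0, alpha, 0.0))

# Illustrative posteriors (made-up values, not from the paper).
eta = np.array([0.2, 0.5, 0.9])
probs = bayes_optimal_classifier(eta, c=0.5, alpha=0.3)
print(probs)
```

In this toy input, the point with posterior below $c$ is rejected, the point above $c$ is selected, and the boundary point is selected with probability $\alpha$. The choice of $\alpha$ only matters on the set where $\eta(v) = c$.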

Figures (3)

Figure 1: Outcomes under classical and fairness-aware ML.
Figure 2: Selection Regions Before and After Enforcing Fairness in the Attribute-Aware Regime.
Figure 3: Selection Regions Before and After Enforcing Fairness in the Attribute-Blind Regime.

Theorems & Definitions (19)

Definition 1: Randomized Classifier
Definition 2
Lemma 1: Bayes-optimal Classifiers
Definition 3: Demographic Parity
Definition 4: Equal Opportunity
Definition 5: Predictive Equality
Definition 6: Disparity Measures
Remark 1
Lemma 2: Bayes-optimal Fair Classifiers
Proposition 1: Opposite Threshold Shifts Across Groups
...and 9 more
