Fix Representation (Optimally) Before Fairness: Finite-Sample Shrinkage Population Correction and the True Price of Fairness Under Subpopulation Shift

Amir Asiaee, Kaveh Aryan

TL;DR

This work addresses the mismeasurement of fairness-utility tradeoffs under subpopulation shift, showing that full importance weighting is asymptotically optimal but finite-sample suboptimal. It introduces a shrinkage-optimal population correction that blends the target and training mixtures via $\lambda^* = (1-\tau)\pi^{\mathrm{tgt}} + \tau\hat{\pi}^{\mathrm{tr}}$, where $\tau = b/(a+b)$ balances bias and variance, with $a = C_{\mathrm{bias}}$ and $b = C_{\mathrm{var}}(1/n_0 + 1/n_1)$. The paper also provides a deconfounding criterion and an evaluation protocol that fixes representation optimally before applying fairness, enabling fair comparisons against a shrinkage-corrected baseline. Empirical validation on synthetic data and real benchmarks (Adult, COMPAS) confirms the theory and shows that the protocol eliminates spurious tradeoffs, revealing the true fairness-utility frontier. Overall, the approach offers a principled way to quantify the irreducible cost of fairness and to benchmark fairness methods without bias from population mismatch.
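
The blend described in the TL;DR can be made concrete with a short sketch. It assumes two groups, so that each mixture is summarized by a scalar group-1 proportion, and it treats $C_{\mathrm{bias}}$ and $C_{\mathrm{var}}$ as known inputs. The function name `shrinkage_mixture` and the example constants are illustrative assumptions, not values from the paper.

```python
def shrinkage_mixture(pi_tgt, pi_tr_hat, c_bias, c_var, n0, n1):
    """Return (lambda_star, tau) for a two-group shrinkage blend.

    pi_tgt:    target proportion of group 1
    pi_tr_hat: empirical training proportion of group 1
    c_bias:    bias constant (assumed known here; placeholder)
    c_var:     variance constant (assumed known here; placeholder)
    n0, n1:    per-group training sample sizes
    """
    if n0 <= 0 or n1 <= 0:
        raise ValueError("both groups need at least one sample")
    a = c_bias
    b = c_var * (1.0 / n0 + 1.0 / n1)
    if a + b <= 0:
        raise ValueError("a + b must be positive")
    tau = b / (a + b)  # weight placed on the training mixture
    lambda_star = (1.0 - tau) * pi_tgt + tau * pi_tr_hat
    return lambda_star, tau


# Hypothetical inputs chosen only to make the arithmetic easy to follow.
lam_star, tau = shrinkage_mixture(
    pi_tgt=0.5, pi_tr_hat=0.1, c_bias=1.0, c_var=10.0, n0=900, n1=100
)
```

With these hypothetical inputs, $b = 10\,(1/900 + 1/100) = 1/9$, so $\tau = 0.1$ and $\lambda^* = 0.46$ up to floating-point rounding, which lies between the training share 0.1 and the target share 0.5. As $n_0$ and $n_1$ grow, $b$ shrinks, $\tau$ tends to 0, and $\lambda^*$ approaches $\pi^{\mathrm{tgt}}$, which corresponds to full importance weighting.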

Abstract

Machine learning practitioners frequently observe tension between predictive accuracy and group fairness constraints -- yet sometimes fairness interventions appear to improve accuracy. We show that both phenomena can be artifacts of training data that misrepresents subgroup proportions. Under subpopulation shift (stable within-group distributions, shifted group proportions), we establish: (i) full importance-weighted correction is asymptotically unbiased but finite-sample suboptimal; (ii) the optimal finite-sample correction is a shrinkage reweighting that interpolates between target and training mixtures; (iii) apparent "fairness helps accuracy" can arise from comparing fairness methods to an improperly-weighted baseline. We provide an actionable evaluation protocol: fix representation (optimally) before fairness -- compare fairness interventions against a shrinkage-corrected baseline to isolate the true, irreducible price of fairness. Experiments on synthetic and real-world benchmarks (Adult, COMPAS) validate our theoretical predictions and demonstrate that this protocol eliminates spurious tradeoffs, revealing the genuine fairness-utility frontier.
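
One way to realize a reweighting toward a mixture $\lambda$ is through per-group sample weights. The sketch below assumes binary group labels and a scalar `lam` giving the desired group-1 share; `mixture_weights` is a hypothetical helper, not code from the paper. Setting `lam` to the target proportion gives full importance weighting, while setting it to the empirical training proportion leaves every weight equal to 1.

```python
from collections import Counter


def mixture_weights(groups, lam):
    """Per-sample weights whose weighted group-1 share equals lam.

    groups: sequence of 0/1 group labels for the training samples
    lam:    desired group-1 proportion, strictly between 0 and 1
    """
    n = len(groups)
    counts = Counter(groups)
    if counts[0] == 0 or counts[1] == 0:
        raise ValueError("both groups must appear in the training data")
    pi_tr_hat = counts[1] / n  # empirical training share of group 1
    per_group = {
        1: lam / pi_tr_hat,
        0: (1.0 - lam) / (1.0 - pi_tr_hat),
    }
    return [per_group[g] for g in groups]


# Toy data: nine samples from group 0 and one from group 1.
toy_groups = [0] * 9 + [1]
toy_weights = mixture_weights(toy_groups, lam=0.5)
```

The weights sum to the number of samples, so the effective sample size of the weighted objective stays on the same scale as the unweighted one. Many training APIs accept such a list through a per-sample weight argument.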

Paper Structure

This paper contains 67 sections, 6 theorems, 28 equations, 2 figures, 3 tables, and 1 algorithm.

Key Result

Lemma 4.3

Under Assumptions ass:regularity--ass:stratified, the expected excess $\lambda$-mixture risk satisfies: where $f^*_\lambda = \arg\min_{f \in \mathcal{F}} [\mathcal{R}_\lambda(f) + r(f)]$.

Figures (2)

  • Figure 1: Target log loss vs. mixture parameter $\lambda$ on synthetic data with group-specific labeling ($\theta_0 \neq \theta_1$). The minimum $\lambda^*$ (orange) lies between $\hat{\pi}^{\mathrm{tr}}$ (red) and $\pi^{\mathrm{tgt}}$ (green), confirming shrinkage-optimal correction.
  • Figure 2: Pareto frontiers on Adult. Both before (left) and after (right) correction show the expected tradeoff: higher fairness (lower DP gap) costs accuracy.
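
The DP gap referenced in Figure 2 can be computed as sketched below. This assumes the common definition, the absolute difference in positive-prediction rates between two groups; the paper's Definition 3.2 may differ in detail, and `dp_gap` is an illustrative name.

```python
def dp_gap(y_pred, groups):
    """Absolute difference in positive-prediction rate between groups 0 and 1.

    y_pred: sequence of 0/1 predictions
    groups: sequence of 0/1 group labels, same length as y_pred
    """
    if len(y_pred) != len(groups):
        raise ValueError("y_pred and groups must have the same length")
    positives = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for yhat, g in zip(y_pred, groups):
        totals[g] += 1
        positives[g] += int(yhat == 1)
    if totals[0] == 0 or totals[1] == 0:
        raise ValueError("both groups must be present")
    rate0 = positives[0] / totals[0]
    rate1 = positives[1] / totals[1]
    return abs(rate0 - rate1)


# Toy example: group 0 gets 1 of 2 positives, group 1 gets 2 of 2.
toy_gap = dp_gap([1, 0, 1, 1], [0, 0, 1, 1])
```

A gap of 0 means both groups receive positive predictions at the same rate; larger values indicate a larger demographic parity violation under this definition.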

Theorems & Definitions (20)

  • Definition 3.2: Demographic parity violation
  • Definition 3.3: Price of fairness
  • Lemma 4.3: Estimation error bound
  • Proof sketch
  • Lemma 4.4: Objective mismatch bound
  • Proof sketch
  • Remark 4.5: When $G = 0$
  • Theorem 4.6: Bias-variance bound for target risk
  • Proof
  • Corollary 4.7: Shrinkage optimizer
  • ...and 10 more