Minimax Optimal Fair Classification with Bounded Demographic Disparity

Xianli Zeng, Guang Cheng, Edgar Dobriban

TL;DR

This work develops a finite-sample minimax theory for fair binary classification, constraining demographic disparity to $|\text{DDP}(f)|\le \delta$ and introducing the fairness-aware excess risk $d_E$. It shows that finite data introduce an additional error term arising from estimating group-wise thresholds, yielding a two-term minimax lower bound. The authors propose FairBayes-DDP+, a two-stage plug-in classifier with offsets that achieves the minimax rate and asymptotic fairness, handling possible jump discontinuities in disparity and boundary mass. Empirical studies on the Adult dataset and simulations demonstrate strong disparity control at user-specified levels while maintaining competitive accuracy and efficiency, highlighting practical applicability in fairness-sensitive domains.
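
For reference, the disparity constraint mentioned above can be written out as follows. This is a sketch under assumed notation: the abstract describes demographic disparity only as the difference in acceptance rates between the two groups, so the sign convention (group 1 minus group 0) and the classifier signature $f(X,A)$ used here are assumptions.

$$\text{DDP}(f) = \mathbb{P}\big(f(X,A)=1 \mid A=1\big) - \mathbb{P}\big(f(X,A)=1 \mid A=0\big), \qquad f \text{ is } \delta\text{-fair} \iff |\text{DDP}(f)| \le \delta.$$

Setting $\delta = 0$ requires equal acceptance rates in both groups, while larger $\delta$ relaxes the constraint.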

Abstract

Mitigating the disparate impact of statistical machine learning methods is crucial for ensuring fairness. While extensive research aims to reduce disparity, the effect of using a \emph{finite dataset} -- as opposed to the entire population -- remains unclear. This paper explores the statistical foundations of fair binary classification with two protected groups, focusing on controlling demographic disparity, defined as the difference in acceptance rates between the groups. Although fairness may come at the cost of accuracy even with infinite data, we show that using a finite sample incurs additional costs due to the need to estimate group-specific acceptance thresholds. We study the minimax optimal classification error while constraining demographic disparity to a user-specified threshold. To quantify the impact of fairness constraints, we introduce a novel measure called \emph{fairness-aware excess risk} and derive a minimax lower bound on this measure that all classifiers must satisfy. Furthermore, we propose FairBayes-DDP+, a group-wise thresholding method with an offset that we show attains the minimax lower bound. Our lower bound proofs involve several innovations. Experiments support that FairBayes-DDP+ controls disparity at the user-specified level, while being faster and having a more favorable fairness-accuracy tradeoff than several baselines.
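
To make the idea of group-wise thresholding concrete, the following minimal Python sketch computes an empirical demographic disparity and searches for a pair of group-specific thresholds that brings it within a tolerance `delta`. It is not the authors' FairBayes-DDP+ procedure: the synthetic scores, the function names `ddp` and `groupwise_threshold`, the sign convention (group 1 minus group 0), and the symmetric grid search over a threshold shift are all hypothetical choices made for illustration.

```python
import numpy as np


def ddp(pred, group):
    """Empirical demographic disparity: acceptance rate in group 1 minus group 0."""
    pred = np.asarray(pred, dtype=float)
    group = np.asarray(group)
    return pred[group == 1].mean() - pred[group == 0].mean()


def groupwise_threshold(scores, group, t1, t0):
    """Accept when the score exceeds the threshold assigned to the sample's group."""
    scores = np.asarray(scores, dtype=float)
    group = np.asarray(group)
    thresholds = np.where(group == 1, t1, t0)
    return (scores > thresholds).astype(int)


# Synthetic data (an assumption, not from the paper): group 1 tends to score higher.
rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, size=n)
scores = np.clip(rng.normal(0.45 + 0.15 * group, 0.2), 0.0, 1.0)

# A single shared threshold ignores the group and leaves a large disparity.
base = groupwise_threshold(scores, group, 0.5, 0.5)
print("DDP with a shared threshold:", ddp(base, group))

# Crude illustration: raise the threshold for group 1 and lower it for group 0
# by the same shift until the empirical disparity is within delta.
delta = 0.05
for shift in np.linspace(0.0, 0.3, 61):
    pred = groupwise_threshold(scores, group, 0.5 + shift, 0.5 - shift)
    if abs(ddp(pred, group)) <= delta:
        break

print("shift:", shift, "DDP after group-wise thresholds:", ddp(pred, group))
```

Because the thresholds here are chosen from the same finite sample on which the disparity is measured, the sketch also hints at the abstract's point that group-specific thresholds must be estimated, which is a source of error beyond the population-level fairness-accuracy tradeoff.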

Paper Structure (50 sections, 27 theorems, 204 equations, 6 figures, 4 tables, 1 algorithm)

This paper contains 50 sections, 27 theorems, 204 equations, 6 figures, 4 tables, 1 algorithm.

Key Result

Proposition 4.2

For any classifier $f\in\mathcal{F}$, the fairness-aware excess risk simplifies in the cases identified in the Remark; the case-wise expressions are not reproduced here. Moreover, for $\delta$-fair classifiers $f$ with $|\textup{DDP}(f)|\leqslant \delta$, we have $d_R\left(f,f_\delta^\star\right)\geqslant d_E(f,f^\star_\delta)$.

Figures (6)

  • Figure 1: Our FairBayes-DDP+ method achieves a better fairness-accuracy tradeoff than other baselines on the "Adult" dataset. Here DDP is the demographic disparity, i.e., the difference in the probabilities of a positive classification between the two protected groups; see \ref{sec:eda} for details.
  • Figure 2: Estimation error of $t^\star_\delta$ in three cases, with $\mathbb{P}(A=1)=1/2$, $X|A=a\sim U(0,1)$ and $\delta = 0$. As we can see, when $D_-(t^\star_\delta) < \delta < D_+(t^\star_\delta)$, $t^\star_\delta$ can be estimated with a fast rate. When $\delta = D_-(t^\star_\delta)$ (or $\delta = D_+(t^\star_\delta)$), the convergence rate depends on the slope of $D_-$ (or $D_+$) near $t^\star_\delta$.
  • Figure 3: Estimated fairness-aware excess risk and DDP of our FairBayes-DDP+ classifier $\widehat{f}_{\delta,n}$ in the setting from \ref{sec:simu-syn}, for various sample sizes.
  • Figure 4: Estimated and population values of the fairness-aware excess risk and DDP of our FairBayes-DDP+ classifier $\widehat{f}_{\delta,n}$ in the setting from \ref{sec:simu-syn}, for various desired disparity levels.
  • Figure 5: When both $D_+(t)$ and $D_-(t)$ are flat at $\delta$ (or $-\delta$), there exists an interval $[t^\star_{\delta,\inf}, t^\star_{\delta,\sup}]$ within which the conditions $D_+(t) = D_-(t) = \delta$ (or $-\delta$) always hold. In other words, the corresponding classifiers satisfy the hard constraint. Here, we set $\mathbb{P}(A=1)=1/2$, $X|A=a \sim U(0,1)$ and $\delta = 0$.
  • ...and 1 more figure

Theorems & Definitions (40)

  • Remark 1: The impact of fairness on Bayes-optimal classifiers
  • Definition 4.1: Fairness-aware excess risk
  • Proposition 4.2: Characterizing fairness-aware excess risk
  • Definition 4.3: Hölder Smoothness
  • Definition 4.4: $\gamma$-Margin Condition (adapted from TsybakovBayes2004, tsy2007, Was2013)
  • Definition 4.5: Strong Density Condition (tsy2007)
  • Definition 4.6: Parameter space
  • Theorem 4.7: Minimax lower bound for fair classification
  • Definition 5.1
  • Proposition 5.2: Properties of a specific fair Bayes-optimal classifier
  • ...and 30 more