On the Inductive Biases of Demographic Parity-based Fair Learning Algorithms

Haoyu Lei, Amin Gohari, Farzan Farnia

TL;DR

Demographic parity (DP)-based fair learning can introduce an inductive bias toward the majority sensitive-attribute group when the training data are imbalanced. The authors analytically characterize this bias under DP-based regularization and propose a sensitive attribute-based distributionally robust optimization (SA-DRO) method that mitigates it by optimizing a worst-case DP-regularized loss over nearby distributions of the sensitive attribute. They derive theoretical bounds for DP and related dependence measures, and validate the approach with centralized and federated experiments on COMPAS, Adult, and CelebA, showing reduced bias with modest accuracy trade-offs. The work offers a tractable way to balance fairness and predictive performance in heterogeneous deployment environments.

Abstract

Fair supervised learning algorithms assigning labels with little dependence on a sensitive attribute have attracted great attention in the machine learning community. While the demographic parity (DP) notion has been frequently used to measure a model's fairness in training fair classifiers, several studies in the literature suggest potential impacts of enforcing DP in fair learning algorithms. In this work, we analytically study the effect of standard DP-based regularization methods on the conditional distribution of the predicted label given the sensitive attribute. Our analysis shows that an imbalanced training dataset with a non-uniform distribution of the sensitive attribute could lead to a classification rule biased toward the sensitive attribute outcome holding the majority of training data. To control such inductive biases in DP-based fair learning, we propose a sensitive attribute-based distributionally robust optimization (SA-DRO) method improving robustness against the marginal distribution of the sensitive attribute. Finally, we present several numerical results on the application of DP-based learning methods to standard centralized and distributed learning problems. The empirical findings support our theoretical results on the inductive biases in DP-based fair learning algorithms and the debiasing effects of the proposed SA-DRO method.
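
The worst-case idea behind SA-DRO can be illustrated with a small Python sketch. It is not the authors' implementation: the binary sensitive attribute, the interval-shaped uncertainty set of radius `eps` around the empirical marginal, the marginal-weighted DP gap, and all function and parameter names (`dp_regularized_loss`, `worst_case_objective`, `lam`, `eps`) are assumptions made for illustration. The sketch only evaluates the inner maximization over the marginal of $S$ for fixed per-group statistics; a learner would then minimize this worst-case value over the classifier.

```python
import numpy as np

def dp_regularized_loss(q1, group_loss, pos_rate, lam):
    """Loss plus DP penalty when P(S=1) is set to q1 (illustrative form).

    group_loss[s] is the average loss on group s, and pos_rate[s] is
    P(Y_hat = 1 | S = s); both are held fixed while the marginal varies.
    """
    q = np.array([1.0 - q1, q1])
    avg_loss = float(q @ group_loss)
    overall_pos = float(q @ pos_rate)  # P(Y_hat = 1) under marginal q
    # Assumed DP measure: q-weighted deviation of each group's positive
    # rate from the overall positive rate.
    dp_gap = float(q @ np.abs(pos_rate - overall_pos))
    return avg_loss + lam * dp_gap

def worst_case_objective(p1, group_loss, pos_rate, lam, eps, grid=201):
    """Grid search for the marginal within eps of p1 that maximizes the objective."""
    lo, hi = max(0.0, p1 - eps), min(1.0, p1 + eps)
    candidates = np.linspace(lo, hi, grid)
    values = [dp_regularized_loss(q1, group_loss, pos_rate, lam)
              for q1 in candidates]
    best = int(np.argmax(values))
    return float(candidates[best]), float(values[best])

# Toy statistics (made up): the minority group S=1 has higher loss
# and a lower positive rate than the majority group S=0.
group_loss = np.array([0.2, 0.6])
pos_rate = np.array([0.5, 0.1])
p1, lam, eps = 0.1, 1.0, 0.1

nominal = dp_regularized_loss(p1, group_loss, pos_rate, lam)
worst_q1, worst_value = worst_case_objective(p1, group_loss, pos_rate, lam, eps)
```

In this toy setting the worst-case marginal places more mass on the minority group than the empirical marginal does, which is the intuition for why robustness to the marginal of $S$ can counteract a bias toward the majority.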

Paper Structure

This paper contains 24 sections, 8 theorems, 52 equations, 8 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

Consider the fair learning problem in Eq. (1) (Fair Classification), where $\rho$ is the DDP function and $\mathcal{F}$ is the space of all randomized maps generating all conditional distributions $P_{\widehat{Y}|\mathbf{X},S}$. Suppose that $Y= h(\mathbf{X},S)$ is a deterministic function $h$ of $\mathbf{X},S$ ...

Figures (8)

  • Figure 1: The first two columns show the trade-off between accuracy and DDP on the COMPAS and Adult datasets under NN-based fair classification methods, while the third column shows that $\mathrm{NR}(s)$ for each subgroup $s\in\{0,1\}$ converges to near that of the majority sensitive-attribute group.
  • Figure 2: The first two columns show the trade-off between accuracy and DDP on the COMPAS and Adult datasets under LR-based fair classification methods, while the third column shows that $\mathrm{NR}(s)$ for each subgroup $s\in\{0,1\}$ converges to near that of the majority sensitive-attribute group.
  • Figure 3: Panels (a) and (b) show the trade-off between accuracy and DDP on the imbalanced CelebA dataset under the MI fair classification method, while (c) shows that $\mathrm{NR}(s)$ for each subgroup converges to that of the majority, thus causing more discrimination against the minority group.
  • Figure 4: Blond hair samples (Majority, Upper) and Non-blond hair samples (Minority, Lower) in the CelebA dataset, as predicted by ERM(NN) and MI respectively. The results show negative rates of 57.3% and 98.8%, i.e., the model tends to predict all Minority samples as female while maintaining almost the same level of accuracy on the whole group.
  • Figure 5: Accuracy, DDP, and $\mathrm{NR}(s)$ values attained by SA-DRO while varying the Lagrangian coefficient of the DRO regularization term on COMPAS (upper) and Adult (lower) datasets.
  • ...and 3 more figures
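
The two quantities plotted in these figures can be estimated from predictions as in the following Python sketch. It assumes that $\mathrm{NR}(s)$ denotes the negative rate $P(\widehat{Y}=0 \mid S=s)$ and uses one common binary form of DDP, the absolute difference in positive rates between the two groups; the paper's exact DDP definition may differ. The function names and toy arrays are hypothetical.

```python
import numpy as np

def negative_rate(y_pred, s, group):
    """Empirical P(Y_hat = 0 | S = group)."""
    y_pred = np.asarray(y_pred)
    s = np.asarray(s)
    mask = s == group
    if not mask.any():
        raise ValueError(f"no samples with sensitive attribute {group}")
    return float(np.mean(y_pred[mask] == 0))

def ddp_binary(y_pred, s):
    """Assumed binary DDP: |P(Y_hat=1 | S=0) - P(Y_hat=1 | S=1)|."""
    pos_rate_0 = 1.0 - negative_rate(y_pred, s, 0)
    pos_rate_1 = 1.0 - negative_rate(y_pred, s, 1)
    return abs(pos_rate_0 - pos_rate_1)

# Toy imbalanced data (made up): six samples with S=0, two with S=1.
y_pred = np.array([0, 0, 1, 1, 0, 1, 0, 0])
s = np.array([0, 0, 0, 0, 0, 0, 1, 1])

nr_majority = negative_rate(y_pred, s, 0)
nr_minority = negative_rate(y_pred, s, 1)
gap = ddp_binary(y_pred, s)
```

Tracking $\mathrm{NR}(s)$ per subgroup alongside DDP is what exposes the effect described in the captions: DDP can shrink while both negative rates move toward the majority group's value.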

Theorems & Definitions (16)

  • Theorem 1
  • proof
  • Corollary 1
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • Remark 1
  • Theorem 4
  • proof
  • ...and 6 more