On the Vulnerability of Fairness Constrained Learning to Malicious Noise

Avrim Blum; Princewill Okoroafor; Aadirupa Saha; Kevin Stangl

TL;DR

This work analyzes how fairness-constrained learning responds to malicious noise in the training data and introduces a randomized post-processing mechanism, the $(P,Q)$-expansion, that enables improper learning and improves robustness. The resulting robustness depends on the fairness notion: under Demographic Parity the learner incurs only $O(\alpha)$ accuracy loss, where $\alpha$ is the malicious noise rate; under Equal Opportunity it incurs $O(\sqrt{\alpha})$ loss, with a matching lower bound; Equalized Odds and related notions suffer $\Omega(1)$ loss in the worst case. Calibration-related notions, including Predictive Parity and Parity Calibration, exhibit regimes where $O(\alpha)$, $O(\sqrt{\alpha})$, or $\Omega(1)$ losses arise, depending on group sizes and calibration strategies. Overall, the paper gives a finer-grained view of fairness robustness under adversarial corruption and shows that randomized, improper learning can significantly alter the robustness landscape for group fairness constraints, with implications for deploying fair learning on noisy real-world data.
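To make the idea of randomized post-processing concrete, here is a minimal Python sketch. The specific rule (keep the base prediction with a group-dependent probability `p`, otherwise predict positive with a group-dependent probability `q`) and the names `pq_predict` and `positive_rate` are assumptions chosen for illustration; the paper's own definition of the $(P,Q)$-expansion may differ in its details.

```python
import random

def pq_predict(h, x, group, p, q, rng=random):
    """Randomized post-processing of a base classifier h (illustrative assumption).

    With probability p[group] return h(x); otherwise ignore h and
    return 1 with probability q[group] (else 0).
    """
    if rng.random() < p[group]:
        return h(x)
    return 1 if rng.random() < q[group] else 0

def positive_rate(base_rate, p, q):
    """Expected positive rate in one group after post-processing.

    base_rate is the fraction of that group the base classifier labels 1.
    """
    return p * base_rate + (1 - p) * q

# Hypothetical numbers: h labels 60% of group A and 20% of group B positive.
# Leaving A untouched (p=1) and setting p=0.5, q=1 for B equalizes the rates.
rate_a = positive_rate(0.6, 1.0, 0.0)
rate_b = positive_rate(0.2, 0.5, 1.0)
```

In this toy setting both groups end up with the same expected positive rate, which is the Demographic Parity condition, without changing the underlying hypothesis `h`.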

Abstract

We consider the vulnerability of fairness-constrained learning to small amounts of malicious noise in the training data. Konstantinov and Lampert (2021) initiated the study of this question and presented negative results showing that there exist data distributions where, for several fairness constraints, any proper learner will exhibit high vulnerability when group sizes are imbalanced. Here, we present a more optimistic view, showing that if we allow randomized classifiers, then the landscape is much more nuanced. For example, for Demographic Parity we show we can incur only a $\Theta(\alpha)$ loss in accuracy, where $\alpha$ is the malicious noise rate, matching the best possible even without fairness constraints. For Equal Opportunity, we show we can incur an $O(\sqrt{\alpha})$ loss, and give a matching $\Omega(\sqrt{\alpha})$ lower bound. In contrast, Konstantinov and Lampert (2021) showed for proper learners the loss in accuracy for both notions is $\Omega(1)$. The key technical novelty of our work is how randomization can bypass simple "tricks" an adversary can use to amplify his power. We also consider additional fairness notions including Equalized Odds and Calibration. For these fairness notions, the excess accuracy loss clusters into three natural regimes: $O(\alpha)$, $O(\sqrt{\alpha})$, and $O(1)$. These results provide a more fine-grained view of the sensitivity of fairness-constrained learning to adversarial noise in training data.
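A short worked example may help show why imbalanced group sizes matter. The sketch below assumes the standard malicious noise formulation, in which each training example is independently replaced by an adversarial one with probability $\alpha$, and it assumes the adversary places all of its points in a single group whose clean probability mass is `r`. The function name and the numbers are hypothetical, and this is an illustration of the arithmetic rather than a result taken from the paper.

```python
def corrupted_share_in_group(alpha, r):
    """Fraction of a group's observed points that are adversarial.

    Assumes a corrupted distribution (1 - alpha) * D + alpha * Adv, where
    the adversary puts all of its mass in one group of clean mass r.
    The group then has total mass (1 - alpha) * r + alpha, of which
    alpha is adversarial.
    """
    return alpha / ((1 - alpha) * r + alpha)

# Hypothetical numbers: a 1% noise rate aimed at a group that makes up
# 5% of the clean data versus the same noise aimed at the whole population.
small_group = corrupted_share_in_group(0.01, 0.05)   # about 0.168
whole_population = corrupted_share_in_group(0.01, 1.0)  # equals alpha
```

Under these assumptions a global noise rate of 1% becomes roughly 17% of the small group's data, so group-conditional statistics such as positive rates or true positive rates can be shifted far more than $\alpha$ alone suggests.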

Paper Structure (23 sections, 14 theorems, 36 equations)

This paper contains 23 sections, 14 theorems, 36 equations.

Introduction
Our Contributions
Related Work
Group Fairness Discussion
Preliminaries
Fairness Notions
Adversary Model
Core Learning Problem
Main Results: Demographic Parity, Equal Opportunity and Equalized Odds
Demographic Parity
Equal Opportunity
Equalized Odds
Main Results: Calibration
Predictive Parity Lower Bound
Extension to Finer Grained Hypothesis Classes
...and 8 more sections

Key Result

Proposition 2

Let $\widetilde{\mathcal{D}}$ be any corrupted distribution chosen by the adversary, and $h$ be a fixed hypothesis in $\mathcal{H}$. For a fixed group $A$, the following inequality bounds the change in the proportion of positive labels assigned by $h$: $\left| P_{(x,y) \sim \widetilde{\mathcal{D}}_A

Theorems & Definitions (19)

Definition 1: $\mathcal{PQ}(\mathcal{H})$
Definition 2
Proposition 2: Parity after corruption
Theorem 3
Proposition 3: TPR after corruption
Theorem 4: Upper Bound
Theorem 5: Lower Bound
Theorem 6: Lower Bound
Definition 7: Predictive Parity (Chouldechova, 2017)
Theorem 8
...and 9 more
