Fairness Under Group-Conditional Prior Probability Shift: Invariance, Drift, and Target-Aware Post-Processing

Amir Asiaee, Kaveh Aryan

TL;DR

This work analyzes fairness under group-conditional prior probability shift (GPPS), where within-group feature-label relationships remain stable but group-specific label prevalences change across domains. It proves a clean dichotomy: equalized odds (and other separation-based criteria) are invariant under GPPS, while demographic parity and predictive parity can drift with prevalence; it also establishes shift-robust impossibility results for DP and PPV. The authors show that target risk and fairness gaps are identifiable without target labels, leveraging ROC invariance to estimate target performance from source data and unlabeled target data. They propose TAP-GPPS, a label-free post-processing pipeline that estimates target prevalences, corrects posteriors, and selects group-specific thresholds to meet target-domain DP with minimal utility loss, and validate it on synthetic and semi-synthetic benchmarks. The results provide actionable guidance for criterion selection and deployment monitoring in non-stationary environments with demographic heterogeneity.
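The last two steps of the pipeline described above (correcting posteriors, then choosing group-specific thresholds) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `correct_posterior` and `thresholds_for_common_rate` are hypothetical, the posterior correction is the standard Bayes re-weighting for a prior shift, and the threshold rule (a per-group quantile of unlabeled target scores) is one simple way to equalize acceptance rates that ignores the utility-optimal choice of the common rate.

```python
import numpy as np

def correct_posterior(p_src, pi_src, pi_tgt):
    """Re-weight a source posterior P_S(Y=1 | x, a) to a new group prevalence.

    Valid when P(X | Y, A) is unchanged and only P(Y=1 | A=a) moves
    from pi_src to pi_tgt (standard prior-shift correction via Bayes' rule).
    """
    pos = p_src * (pi_tgt / pi_src)
    neg = (1.0 - p_src) * ((1.0 - pi_tgt) / (1.0 - pi_src))
    return pos / (pos + neg)

def thresholds_for_common_rate(scores_by_group, rate):
    """Pick one threshold per group so that each group accepts roughly
    a fraction `rate` of its (unlabeled) target samples."""
    return {a: float(np.quantile(s, 1.0 - rate))
            for a, s in scores_by_group.items()}

# Hypothetical example: two groups whose score distributions differ.
rng = np.random.default_rng(0)
raw = {"a": rng.random(2000), "b": rng.random(2000) ** 2}
# Assumed source/target prevalences, for illustration only.
corrected = {
    "a": correct_posterior(raw["a"], pi_src=0.3, pi_tgt=0.3),
    "b": correct_posterior(raw["b"], pi_src=0.3, pi_tgt=0.5),
}
thr = thresholds_for_common_rate(corrected, rate=0.25)
rates = {a: float(np.mean(corrected[a] > thr[a])) for a in corrected}
```

Because thresholds are set on the target scores of each group separately, no target labels are needed at this step; labels enter only through the source-trained scores and the prevalence estimates.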

Abstract

Machine learning systems are often trained and evaluated for fairness on historical data, yet deployed in environments where conditions have shifted. A particularly common form of shift occurs when the prevalence of positive outcomes changes differently across demographic groups--for example, when disease rates rise faster in one population than another, or when economic conditions affect loan default rates unequally. We study group-conditional prior probability shift (GPPS), where the label prevalence $P(Y=1\mid A=a)$ may change between training and deployment while the feature-generation process $P(X\mid Y,A)$ remains stable. Our analysis yields three main contributions. First, we prove a fundamental dichotomy: fairness criteria based on error rates (equalized odds) are structurally invariant under GPPS, while acceptance-rate criteria (demographic parity) can drift--and we prove this drift is unavoidable for non-trivial classifiers (shift-robust impossibility). Second, we show that target-domain risk and fairness metrics are identifiable without target labels: the invariance of ROC quantities under GPPS enables consistent estimation from source labels and unlabeled target data alone, with finite-sample guarantees. Third, we propose TAP-GPPS, a label-free post-processing algorithm that estimates prevalences from unlabeled data, corrects posteriors, and selects thresholds to satisfy demographic parity in the target domain. Experiments validate our theoretical predictions and demonstrate that TAP-GPPS achieves target fairness with minimal utility loss.
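The dichotomy in the first contribution can be checked with a few lines of arithmetic. The sketch below assumes a fixed classifier whose per-group true and false positive rates are known and, by the invariance claim, unchanged between domains; the function names and the numbers are illustrative, not taken from the paper. The acceptance rate is $\pi\cdot\mathrm{TPR} + (1-\pi)\cdot\mathrm{FPR}$, which is affine in the prevalence $\pi$, while PPV depends on $\pi$ nonlinearly.

```python
def acceptance_rate(pi, tpr, fpr):
    """P(Yhat=1 | A=a) by the law of total probability; affine in pi."""
    return pi * tpr + (1.0 - pi) * fpr

def ppv(pi, tpr, fpr):
    """P(Y=1 | Yhat=1, A=a) by Bayes' rule; nonlinear in pi."""
    return pi * tpr / (pi * tpr + (1.0 - pi) * fpr)

def dp_gap(pi_by_group, roc_by_group):
    """Largest difference in acceptance rates across groups."""
    rates = [acceptance_rate(pi_by_group[a], *roc_by_group[a])
             for a in pi_by_group]
    return max(rates) - min(rates)

def eo_gap(roc_by_group):
    """Largest TPR or FPR difference across groups; does not involve pi."""
    tprs = [t for t, _ in roc_by_group.values()]
    fprs = [f for _, f in roc_by_group.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))

# Illustrative (TPR, FPR) pairs, identical across groups.
roc = {"a": (0.80, 0.10), "b": (0.80, 0.10)}
source_pi = {"a": 0.3, "b": 0.3}
target_pi = {"a": 0.3, "b": 0.5}   # prevalence rises in group b only

gap_source = dp_gap(source_pi, roc)   # equal prevalences, so no DP gap
gap_target = dp_gap(target_pi, roc)   # DP gap opens up under the shift
gap_eo = eo_gap(roc)                  # same value in both domains
```

Since `dp_gap` takes only prevalences and source-estimated ROC quantities, the same calculation gives a label-free estimate of the target DP gap once target prevalences have been estimated.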

Paper Structure

This paper contains 63 sections, 15 theorems, 42 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Lemma 4.1

Assume GPPS (Definition 3.1). Fix any measurable score function $f: \mathcal{X}\times\mathcal{A}\to\mathbb{R}$. Then for all $a\in\mathcal{A}$, $y\in\{0,1\}$, and all measurable sets $B\subseteq\mathbb{R}$,
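
The displayed equation that completes this statement is missing from this extract. Given the lemma's title (score distribution invariance given $(Y,A)$) and the GPPS assumption that $P(X\mid Y,A)$ is stable, the conclusion presumably has the following form, where $P_S$ and $P_T$ are assumed names for the source and target distributions rather than notation confirmed by the paper:

$$
P_S\big(f(X,A)\in B \mid Y=y,\, A=a\big) \;=\; P_T\big(f(X,A)\in B \mid Y=y,\, A=a\big).
$$

The reasoning is one step: given $A=a$, the score $f(X,a)$ is a fixed measurable function of $X$, so its conditional law given $(Y,A)$ is determined by $P(X\mid Y,A)$, which GPPS holds fixed across domains.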

Figures (2)

  • Figure 1: Validation of invariance and drift predictions on synthetic data. (a) EO gap remains constant ($\approx 0.03$) across all prevalence shifts, confirming invariance under GPPS (Theorem 4.3). (b) DP gap varies linearly with prevalence shift, matching theoretical predictions (dashed line) from Proposition \ref{prop:ar-affine}. (c) PPV gap varies nonlinearly with prevalence shift, following Proposition \ref{prop:ppv}. Error bars show $\pm 1$ standard deviation over 5 random seeds.
  • Figure 2: Sample complexity of TAP-GPPS on synthetic data. (a) DP gap decreases with unlabeled sample size, crossing $0.05$ with roughly $500$ target samples per group. (b) Prevalence estimation error follows the theoretical $O(1/\sqrt{m})$ rate. (c) Accuracy stabilizes quickly.
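
The $O(1/\sqrt{m})$ behavior of prevalence estimation in Figure 2(b) can be reproduced in miniature. The sketch below is an assumption-laden illustration, not the paper's estimator: it inverts the acceptance-rate identity $\mathrm{AR} = \pi\cdot\mathrm{TPR} + (1-\pi)\cdot\mathrm{FPR}$ (an adjusted-count style estimator) and treats the group's TPR and FPR as known exactly, whereas in practice they would be estimated from labeled source data. The names `estimate_prevalence` and `mean_abs_error` are hypothetical.

```python
import numpy as np

def estimate_prevalence(target_decisions, tpr, fpr):
    """Label-free prevalence estimate for one group.

    Solves AR = pi*TPR + (1-pi)*FPR for pi, using the observed
    acceptance rate on unlabeled target data. Requires tpr != fpr.
    """
    ar = float(np.mean(target_decisions))
    return float(np.clip((ar - fpr) / (tpr - fpr), 0.0, 1.0))

def mean_abs_error(m, pi_true=0.4, tpr=0.8, fpr=0.1, trials=200, seed=0):
    """Average |pi_hat - pi_true| over simulated target samples of size m."""
    rng = np.random.default_rng(seed)
    ar_true = pi_true * tpr + (1.0 - pi_true) * fpr
    errors = []
    for _ in range(trials):
        decisions = rng.random(m) < ar_true   # simulated accept/reject outcomes
        errors.append(abs(estimate_prevalence(decisions, tpr, fpr) - pi_true))
    return float(np.mean(errors))

# Error should shrink by roughly 10x when m grows by 100x.
err_small = mean_abs_error(100)
err_large = mean_abs_error(10_000)
```

The division by `tpr - fpr` shows why a weak classifier (TPR close to FPR) inflates the estimation error: sampling noise in the acceptance rate is amplified by the inverse of that difference.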

Theorems & Definitions (39)

  • Definition 2.1: Equalized Odds (Hardt et al., 2016)
  • Definition 2.2: Demographic Parity (Calders et al., 2009)
  • Definition 2.3: Predictive Parity (Chouldechova, 2017)
  • Definition 3.1: Group-Conditional Prior Probability Shift (GPPS)
  • Remark 3.2: Relation to standard label shift
  • Remark 3.3: Relation to demographic shift
  • Lemma 4.1: Score distribution invariance given $(Y,A)$
  • Corollary 4.2: Invariance of ROC quantities
  • Theorem 4.3: Invariance of equalized odds under GPPS
  • Remark 4.4: Practical implications
  • ...and 29 more
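
For orientation, the three criteria named in Definitions 2.1 to 2.3 are usually stated as below for a binary decision $\hat{Y}$ and any two groups $a, a' \in \mathcal{A}$. These are the standard textbook forms; the paper's own statements, which are not reproduced in this extract, may differ in notation or allow a tolerance.

$$
\begin{aligned}
\text{Equalized odds:}\quad & P(\hat{Y}=1 \mid Y=y, A=a) = P(\hat{Y}=1 \mid Y=y, A=a') \quad \text{for } y\in\{0,1\},\\
\text{Demographic parity:}\quad & P(\hat{Y}=1 \mid A=a) = P(\hat{Y}=1 \mid A=a'),\\
\text{Predictive parity:}\quad & P(Y=1 \mid \hat{Y}=1, A=a) = P(Y=1 \mid \hat{Y}=1, A=a').
\end{aligned}
$$

Only the first conditions on $Y$ alone, which is why it depends on $P(X\mid Y,A)$ but not on the prevalence $P(Y=1\mid A=a)$.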