A Generic Framework for Conformal Fairness

Aditya T. Vadlamani, Anutam Srinivasan, Pranav Maneriker, Ali Payani, Srinivasan Parthasarathy

TL;DR

This work introduces Conformal Fairness (CF), a framework that integrates fairness constraints into conformal prediction by controlling conditional coverage gaps across sensitive groups under data exchangeability. By filtering calibration data with group- and label-conditioned fairness metrics and selecting an optimal threshold $\lambda$ over non-conformity scores, CF can enforce a user-specified closeness $c$ for various metrics, including Demographic Parity, Equal Opportunity, and Predictive Parity, while maintaining distribution-free coverage guarantees. The framework supports multiple non-conformity scores, extends to graph data because it relies only on exchangeability, and provides fairness auditing capabilities without requiring group labels at inference. Empirical results on graph and tabular datasets show that CF can significantly reduce fairness disparity and come close to regulatory disparity bounds (e.g., the Four-Fifths Rule) with modest efficiency trade-offs, including in intersectional fairness settings and with predictive parity proxies. Overall, CF offers a versatile, theoretically grounded approach to fair uncertainty quantification with practical implications for auditing and deploying fair conformal predictors in complex domains.

Abstract

Conformal Prediction (CP) is a popular method for uncertainty quantification with machine learning models. While conformal prediction provides probabilistic guarantees regarding the coverage of the true label, these guarantees are agnostic to the presence of sensitive attributes within the dataset. In this work, we formalize Conformal Fairness, a notion of fairness using conformal predictors, and provide a theoretically well-founded algorithm and associated framework to control for the gaps in coverage between different sensitive groups. Our framework leverages the exchangeability assumption (implicit to CP) rather than the typical IID assumption, allowing us to apply the notion of Conformal Fairness to data types and tasks that are not IID, such as graph data. Experiments were conducted on graph and tabular datasets to demonstrate that the algorithm can control fairness-related gaps in addition to coverage aligned with theoretical expectations.
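To make the idea of a coverage gap between sensitive groups concrete, the following is a minimal sketch, not the paper's implementation. It assumes a Demographic Parity-style reading in which, for a label of interest, one compares how often that label appears in the prediction set across groups. The function names `groupwise_coverage` and `coverage_gap`, the boolean-matrix encoding of prediction sets, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def groupwise_coverage(pred_sets, groups, y_tilde):
    """Fraction of samples in each group whose prediction set contains y_tilde.

    pred_sets: boolean array of shape (n_samples, n_classes); entry [i, k] is
               True when label k is in the prediction set of sample i.
    groups:    integer array of shape (n_samples,) with sensitive-group ids.
    y_tilde:   index of the label whose coverage is being audited.
    """
    return {
        int(g): float(pred_sets[groups == g, y_tilde].mean())
        for g in np.unique(groups)
    }

def coverage_gap(pred_sets, groups, y_tilde):
    """Largest difference in groupwise coverage of y_tilde (hypothetical metric)."""
    cov = groupwise_coverage(pred_sets, groups, y_tilde)
    return max(cov.values()) - min(cov.values())

# Toy data (illustrative only): four samples, two classes, two groups.
pred_sets = np.array([[1, 1], [0, 1], [1, 1], [1, 0]], dtype=bool)
groups = np.array([0, 0, 1, 1])

cov = groupwise_coverage(pred_sets, groups, y_tilde=1)
gap = coverage_gap(pred_sets, groups, y_tilde=1)
c = 0.1  # user-specified closeness; the value here is arbitrary
is_fair = gap <= c
```

Under this reading, a predictor would be considered fair for the audited label when the gap does not exceed the chosen closeness $c$.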

Paper Structure

This paper contains 54 sections, 10 theorems, 24 equations, 17 figures, 14 tables, 1 algorithm.

Key Result

Lemma 3.0

For any $(g, \tilde{y})\in\mathcal{G}\times\mathcal{Y}^+$, calibrating on ${\mathcal{D}_{\mathrm{calib}}}_{(g, \tilde{y})} = \{({\bm{x}}_i, y_i)~|~F_M({\bm{x}}_{i},y_{i}, g, \tilde{y}) = 1\}$ guarantees the following about the conditional coverage: The interval width is $\frac{1}{|{\mathcal{D}_{\mathrm{calib}}}_{(g, \tilde{y})}| + 1}$.
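The lemma above can be illustrated with a short sketch: filter the calibration set with an indicator, compute a standard split-conformal threshold on the surviving scores, and report the interval width $1/(n+1)$ for the filtered size $n$. This is a sketch under stated assumptions rather than the paper's code. In particular, the indicator used here (group equals $g$ and true label equals $\tilde{y}$) is only one hypothetical choice of $F_M$, and the function names and toy data are invented for illustration.

```python
import math
import numpy as np

def filter_calibration(groups, labels, g, y_tilde):
    """Boolean mask for a hypothetical F_M: 1 when group == g and label == y_tilde."""
    return (groups == g) & (labels == y_tilde)

def conformal_threshold(scores, alpha):
    """Split-conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score.

    Returns +inf when that rank exceeds n, which corresponds to the
    trivial (full) prediction set.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return float(np.sort(scores)[k - 1])

def interval_width(n_filtered):
    """Width 1 / (n + 1) from the lemma, for a filtered calibration set of size n."""
    return 1.0 / (n_filtered + 1)

# Toy calibration data (illustrative only).
groups = np.array([0, 0, 1, 1, 0, 1, 0, 0])
labels = np.array([1, 1, 1, 0, 1, 1, 0, 1])
scores = np.array([0.1, 0.4, 0.9, 0.3, 0.2, 0.8, 0.5, 0.7])

mask = filter_calibration(groups, labels, g=0, y_tilde=1)
lam = conformal_threshold(scores[mask], alpha=0.5)
width = interval_width(int(mask.sum()))
```

Because the width shrinks as $1/(n+1)$, small filtered subsets (rare groups or rare labels) yield looser intervals, which is one practical reason to check the size of each filtered calibration set.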

Figures (17)

  • Figure 1: ACSIncome. The top plots are efficiency results, while the bottom are the fairness disparities for (a) APS, (b) RAPS, and (c) TPS. In all cases, our framework gives results at or better than the desired threshold and better than the baseline.
  • Figure 2: Credit. The top plots are efficiency results, while the bottom are the fairness disparities for (a) APS, (b) CFGNN, (c) DAPS, (d) RAPS, and (e) TPS. In all cases, our framework achieves the desired coverage gap better than the baseline, with a minor impact on efficiency.
  • Figure 3: Pokec-n using both sensitive attributes. The top plots are the efficiency results, while the bottom plots are the fairness disparities for (a) APS, (b) CFGNN, (c) DAPS, (d) RAPS, and (e) TPS. CFGNN (b) and DAPS (c) achieve the desired fairness coverage thresholds better than standard CP methods.
  • Figure E1: ACSEducation. Comparison of efficiencies when using the CF Framework without (top) and with (bottom) classwise lambdas. We observe that the efficiencies are better in the bottom plot. This is because $\lambda_{\text{non-classwise}} \geq \lambda_{\text{classwise}}^{i}$ for all $i \in \{1, \ldots, k\}$ ($k$ is the number of classes), which causes fewer labels to be included in the prediction set, thus improving efficiency with the classwise approach. For some experiments, the fairness disparity is $0$ (e.g., APS and RAPS in the non-classwise setting) because the framework produces the full prediction set (the trivial case), so the coverage of $\tilde{y} \in \mathcal{Y}^+$ is $1.00$ and the disparity is $0$.
  • Figure E2: ACSEducation. Comparison of fairness disparities when using the CF Framework without (top) and with (bottom) classwise lambdas. We observe that the fairness disparities are better in the top plot. This is because, with a single $\lambda$, only the hardest-to-satisfy label will be at or around the coverage gap $c$, whereas the classwise approach places all labels at or around $c$. Since fewer labels have coverages around the coverage gap in the non-classwise setting (top), the likelihood of exceeding the threshold is lower than with the classwise approach (bottom).
  • ...and 12 more figures
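
The efficiency argument in the Figure E1 caption can be sketched numerically. The example below assumes the common threshold convention in which label $k$ enters the prediction set when its non-conformity score is at most $\lambda$ (or at most $\lambda^{k}$ in the classwise case); that convention, the helper names, and the toy scores are assumptions for illustration, not details taken from the paper. The single threshold is set to the largest classwise threshold so that it dominates every classwise value, as the caption states.

```python
import numpy as np

def prediction_sets(scores, lam):
    """Include label k for sample i when scores[i, k] <= lam.

    lam may be a scalar (single threshold) or an array with one entry
    per class (classwise thresholds); NumPy broadcasting handles both.
    """
    return scores <= np.asarray(lam)

def mean_set_size(sets):
    """Average number of labels per prediction set (smaller is more efficient)."""
    return float(sets.sum(axis=1).mean())

# Toy non-conformity scores: two samples, three classes (illustrative only).
scores = np.array([[0.1, 0.6, 0.9],
                   [0.4, 0.2, 0.7]])

lam_classwise = np.array([0.3, 0.5, 0.8])   # hypothetical per-class thresholds
lam_single = float(lam_classwise.max())     # single threshold >= every classwise one

sets_classwise = prediction_sets(scores, lam_classwise)
sets_single = prediction_sets(scores, lam_single)

size_classwise = mean_set_size(sets_classwise)
size_single = mean_set_size(sets_single)

# Every label kept by the classwise rule is also kept by the single threshold.
classwise_is_subset = bool(np.all(sets_classwise <= sets_single))
```

In this toy case the classwise sets are subsets of the single-threshold sets, so their average size cannot be larger, which mirrors the efficiency ordering described in the caption.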

Theorems & Definitions (16)

  • Lemma 3.0
  • Lemma 3.0
  • Lemma 3.0
  • Theorem 3.1
  • Lemma B.0
  • proof
  • Lemma B.1
  • proof : Proof Sketch
  • Lemma B.1
  • proof
  • ...and 6 more