Fairness Meets Privacy: Integrating Differential Privacy and Demographic Parity in Multi-class Classification

Lilian Say, Christophe Denis, Rafael Pinot

TL;DR

The paper tackles the challenge of simultaneously preserving data privacy and ensuring fairness in multi-class classification. It proposes DP2DP, a two-phase post-processing pipeline that first builds a differentially private probabilistic classifier on labeled data and then enforces $\rho$-demographic parity using unlabeled data with a privacy-preserving optimization. The authors establish a Rényi DP guarantee for DP2DP and prove a fairness bound showing the unfairness gap to $\rho$ decays at $\mathcal{O}(\log(N)/\sqrt{N})$ up to constants and smoothing error, aligning with non-private baselines up to a logarithmic factor. Empirically, DP2DP achieves state-of-the-art accuracy/fairness/privacy trade-offs on synthetic data and real-world datasets (notably the Adult benchmark), demonstrating that privacy and fairness can be integrated with only mild performance overhead and without sacrificing practical utility.
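To make the $\rho$-demographic parity target concrete, here is a minimal sketch of how an empirical parity gap could be measured for a multi-class classifier. It assumes one common formulation (the largest deviation, over classes and groups, between a class's prediction rate inside a group and its overall prediction rate); the paper's own definition may differ, for instance by comparing groups pairwise. The function names are hypothetical and not taken from the paper.

```python
def demographic_parity_gap(y_pred, groups, n_classes):
    """Largest gap, over classes k and groups s, between the rate at which
    class k is predicted inside group s and its overall prediction rate.

    Assumed formulation for illustration; not necessarily the paper's."""
    n = len(y_pred)
    gap = 0.0
    for k in range(n_classes):
        # Fraction of all samples predicted as class k.
        overall_rate = sum(1 for y in y_pred if y == k) / n
        for s in set(groups):
            # Predictions restricted to sensitive group s.
            in_group = [y for y, g in zip(y_pred, groups) if g == s]
            group_rate = sum(1 for y in in_group if y == k) / len(in_group)
            gap = max(gap, abs(group_rate - overall_rate))
    return gap


def satisfies_rho_parity(y_pred, groups, n_classes, rho):
    """True when the empirical parity gap is at most the tolerance rho."""
    return demographic_parity_gap(y_pred, groups, n_classes) <= rho


# Toy data: 3 classes, 2 groups of 4 samples each.
y_pred = [0, 0, 1, 1, 2, 2, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
print(demographic_parity_gap(y_pred, groups, n_classes=3))  # 0.25
```

In the toy data, class 1 is predicted for half of group 0 but never for group 1, against an overall rate of 0.25, which gives the gap of 0.25. A tolerance of $\rho = 0.3$ would accept these predictions, while $\rho = 0.1$ would not.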

Abstract

The increasing use of machine learning in sensitive applications demands algorithms that simultaneously preserve data privacy and ensure fairness across potentially sensitive sub-populations. While privacy and fairness have each been extensively studied, their joint treatment remains poorly understood. Existing research often frames them as conflicting objectives, with multiple studies suggesting that strong privacy notions such as differential privacy inevitably compromise fairness. In this work, we challenge that perspective by showing that differential privacy can be integrated into a fairness-enhancing pipeline with minimal impact on fairness guarantees. We design a postprocessing algorithm, called DP2DP, that enforces both demographic parity and differential privacy. Our analysis reveals that our algorithm converges towards its demographic parity objective at essentially the same rate (up to a logarithmic factor) as the best non-private methods from the literature. Experiments on both synthetic and real datasets confirm our theoretical results, showing that the proposed algorithm achieves state-of-the-art accuracy/fairness/privacy trade-offs.

Paper Structure

This paper contains 33 sections, 21 theorems, 114 equations, 4 figures, 1 table, 2 algorithms.

Key Result

Theorem 4.1

Consider the DP2DP scheme, as in Algorithm algo:dp-fair. If the step-size sequence is such that $\eta_t = \eta \leq 2\beta$ for all $t \in [T]$, then DP2DP satisfies $(\alpha, \varepsilon)$-Rényi differential privacy for all $\alpha \geq 1$, where $\Psi := \Psi\left(T, b ,N, \eta, \sigma_{\rm{SGD}} \right)$ is defined as $Q = S_\alpha\!\left(\frac{b}{N}, \frac{b \sigma_2}{4}\right)$, and f
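The figures report $(\varepsilon, \delta)$-differential privacy, while the theorem is stated in Rényi terms. A minimal sketch of the standard conversion between the two (Mironov, 2017) follows: an $(\alpha, \varepsilon)$-RDP mechanism with $\alpha > 1$ is $(\varepsilon + \log(1/\delta)/(\alpha - 1), \delta)$-DP. The toy RDP curve used here is that of a Gaussian mechanism with sensitivity 1, $\varepsilon(\alpha) = \alpha / (2\sigma^2)$; it is an assumption for illustration only and is not the paper's bound $\Psi$. The function names and the grid of orders are likewise hypothetical, and the paper may use a tighter conversion.

```python
import math


def rdp_to_dp(alpha, rdp_epsilon, delta):
    """Convert an (alpha, rdp_epsilon)-RDP guarantee into an
    (epsilon, delta)-DP guarantee via the standard bound
    epsilon = rdp_epsilon + log(1/delta) / (alpha - 1)."""
    if alpha <= 1:
        raise ValueError("the conversion requires alpha > 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return rdp_epsilon + math.log(1.0 / delta) / (alpha - 1.0)


def best_dp_epsilon(rdp_curve, alphas, delta):
    """Smallest (epsilon, delta)-DP epsilon over a grid of Renyi orders."""
    return min(rdp_to_dp(a, rdp_curve(a), delta) for a in alphas)


def gaussian_rdp(sigma):
    """Toy RDP curve of a sensitivity-1 Gaussian mechanism (illustrative
    assumption, not the bound from the theorem)."""
    return lambda alpha: alpha / (2.0 * sigma ** 2)


# Optimize the Renyi order over a grid for delta = 1e-5.
alphas = [1.0 + 0.5 * i for i in range(1, 200)]
print(best_dp_epsilon(gaussian_rdp(sigma=10.0), alphas, delta=1e-5))
```

Because the RDP guarantee holds for a range of orders, the order can be chosen after the fact to minimize the resulting $\varepsilon$, which is what `best_dp_epsilon` does over a finite grid.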

Figures (4)

  • Figure 1: Synthetic experiments with $d=20$ features, $K=6$ classes, $10{,}000$ samples. Left: varying fairness tolerance $\rho$ ($p=0.75$). Right: varying unfairness parameter $p$. The combination of Phase 1 and Phase 2 satisfies $(0.46, 10^{-5})$-differential privacy.
  • Figure 2: Comparison of our method (in terms of fairness/misclassification) with previous work on the Adult dataset under $(\varepsilon, \delta)$-differential privacy with $\delta = 10^{-5}$. The left panel shows $\varepsilon = 0.5$, and the right panel shows $\varepsilon = 1.0$.
  • Figure 3: Comparison of our method (in terms of fairness/misclassification) with previous work on the Default-CCC dataset under $(\varepsilon, \delta)$-differential privacy with $\delta = 10^{-5}$. The left panel shows $\varepsilon = 0.5$, and the right panel shows $\varepsilon = 1.0$.
  • Figure 4: Comparison of our method (in terms of fairness/misclassification) with previous work on the Parkinson dataset under $(\varepsilon, \delta)$-differential privacy with $\delta = 10^{-5}$.

Theorems & Definitions (42)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Theorem 4.1
  • proof : Sketch of Proof
  • Theorem 5.1
  • proof : Proof sketch
  • Corollary 5.1
  • Definition A.1: $L$-Lipschitz continuity
  • Definition A.2: $\beta$-smoothness
  • ...and 32 more