
Federated fairness-aware classification under differential privacy

Gengyu Xue, Yi Yu

Abstract

Privacy and algorithmic fairness have become two central issues in modern machine learning. Although each has separately emerged as a rapidly growing research area, their joint effect remains comparatively under-explored. In this paper, we systematically study the joint impact of differential privacy and fairness on classification in a federated setting, where data are distributed across multiple servers. Targeting demographic disparity constrained classification under federated differential privacy, we propose a two-step algorithm, namely FDP-Fair. In the special case where there is only one server, we further propose a simple yet powerful algorithm, namely CDP-Fair, serving as a computationally-lightweight alternative. Under mild structural assumptions, theoretical guarantees on privacy, fairness and excess risk control are established. In particular, we disentangle the source of the private fairness-aware excess risk into a) intrinsic cost of classification, b) cost of private classification, c) non-private cost of fairness and d) private cost of fairness. Our theoretical findings are complemented by extensive numerical experiments on both synthetic and real datasets, highlighting the practicality of our designed algorithms.
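As illustrative background for the two notions combined in the abstract (a minimal sketch, not the paper's FDP-Fair or CDP-Fair procedure), demographic disparity is the gap between a classifier's positive-prediction rates across a binary sensitive attribute, and a standard way to release such a statistic privately is the Laplace mechanism:

```python
import numpy as np

def private_demographic_disparity(y_pred, a, epsilon, rng=None):
    """Laplace-mechanism estimate of empirical demographic disparity.

    y_pred  : array of 0/1 predictions
    a       : array of binary sensitive-attribute labels (0 or 1)
    epsilon : differential privacy budget
    """
    rng = rng or np.random.default_rng()
    rate0 = y_pred[a == 0].mean()
    rate1 = y_pred[a == 1].mean()
    disparity = abs(rate0 - rate1)
    # Changing one record moves at most one group's rate, by at most
    # 1/n_a, so 1/min(n0, n1) bounds the sensitivity of the disparity.
    n0, n1 = (a == 0).sum(), (a == 1).sum()
    sensitivity = 1.0 / min(n0, n1)
    return disparity + rng.laplace(scale=sensitivity / epsilon)
```

Smaller `epsilon` means stronger privacy and a noisier disparity estimate, the basic trade-off the paper's excess-risk decomposition quantifies.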

Paper Structure

This paper contains 42 sections, 43 theorems, 236 equations, 9 figures, and 6 algorithms.

Key Result

Theorem 1

Denote by $\widetilde{f}^{\mathrm{FDP}}_{\mathrm{DD},\alpha}$ and $\widetilde{f}^{\mathrm{CDP}}_{\mathrm{DD},\alpha}$ the outputs of Algorithms alg_fair_fdp and alg_fair_cdp, respectively. Then, under Assumptions a_prob, a_kernel and a_posterior, the following…

Figures (9)

  • Figure 1: An illustration of the framework we consider in the problem of fairness-aware classification in a distributed setting.
  • Figure 2: Graphical illustration of S2 of Algorithm alg_fair_cdp when $\alpha = 0.1$.
  • Figure 3: Means and $95\%$ confidence bands for misclassification errors and empirical disparities of Algorithm alg_fair_cdp when $N \in \{5000, 7000, 9000\}$. The grey dashed line represents $y=x$.
  • Figure 4: Means and $95\%$ confidence bands for misclassification errors and empirical disparities of Algorithm alg_fair_fdp when $N_s = 2000$ and $S \in \{4,5,6\}$. The grey dashed line represents $y=x$.
  • Figure 5: Means and $95\%$ confidence bands for misclassification errors and empirical disparities of Algorithms alg_fair_fdp and alg_fair_cdp when $N_{\text{total}} = 7200$ and $S \in \{1,2,3,4,5\}$. The grey dashed line represents $y=x$.
  • ...and 4 more figures

Theorems & Definitions (89)

  • Definition 1: Central differential privacy, CDP
  • Definition 2: Federated differential privacy, FDP
  • Definition 3: Randomised classifier
  • Definition 4: Demographic disparity
  • Remark 1
  • Remark 2
  • Theorem 1
  • Remark 3
  • Definition 5: Fairness-aware excess risk under DD, Definition 4.1 in zeng2024minimax
  • Remark 4
  • ...and 79 more
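As illustrative background for the federated differential privacy (FDP) setting defined above (a sketch under generic assumptions, not the paper's FDP-Fair algorithm), the key idea is that each server privatizes its local statistic before sharing, so the aggregator never sees raw data:

```python
import numpy as np

def fdp_mean(server_data, epsilon, clip=1.0, rng=None):
    """Each server releases a Laplace-noised clipped local mean;
    the aggregator simply averages the noisy releases.

    server_data : list of 1-D arrays, one per server
    epsilon     : per-server privacy budget
    clip        : records are clipped to [-clip, clip] to bound sensitivity
    """
    rng = rng or np.random.default_rng()
    noisy_means = []
    for x in server_data:
        x = np.clip(x, -clip, clip)
        # One record can move the clipped mean by at most 2 * clip / n.
        sensitivity = 2 * clip / len(x)
        noisy_means.append(x.mean() + rng.laplace(scale=sensitivity / epsilon))
    return float(np.mean(noisy_means))
```

With more servers, the independent local noise terms average out, which is one reason federated guarantees can remain useful despite per-server noising.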