Tackling Privacy Heterogeneity in Differentially Private Federated Learning

Ruichen Xu, Ying-Jun Angela Zhang, Jianwei Huang

TL;DR

This work proposes a privacy-aware client selection strategy, formulated as a convex optimization problem, that adaptively adjusts selection probabilities to minimize training error and achieves up to a 10% improvement in test accuracy on CIFAR-10 compared to existing baselines under heterogeneous privacy budgets.

Abstract

Differentially private federated learning (DP-FL) enables clients to collaboratively train machine learning models while preserving the privacy of their local data. However, most existing DP-FL approaches assume that all clients share a uniform privacy budget, an assumption that does not hold in real-world scenarios where privacy requirements vary widely. This privacy heterogeneity poses a significant challenge: conventional client selection strategies, which typically rely on data quantity, cannot distinguish between clients providing high-quality updates and those introducing substantial noise due to strict privacy constraints. To address this gap, we present the first systematic study of privacy-aware client selection in DP-FL. We establish a theoretical foundation by deriving a convergence analysis that quantifies the impact of privacy heterogeneity on training error. Building on this analysis, we propose a privacy-aware client selection strategy, formulated as a convex optimization problem, that adaptively adjusts selection probabilities to minimize training error. Extensive experiments on benchmark datasets demonstrate that our approach achieves up to a 10% improvement in test accuracy on CIFAR-10 compared to existing baselines under heterogeneous privacy budgets. These results highlight the importance of incorporating privacy heterogeneity into client selection for practical and effective federated learning.

Paper Structure

This paper contains 41 sections, 12 theorems, 84 equations, 6 figures, 4 tables, and 1 algorithm.

Key Result

Lemma 1

Suppose a mechanism $\mathcal{M}$ satisfies $\left(\log\left(\frac{e^\epsilon-1}{r}+1\right), \frac{\delta}{r}\right)$-DP. If the data is subsampled without replacement at ratio $r$, then the subsampled mechanism satisfies $(\epsilon, \delta)$-DP.
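
To make the lemma concrete, the following minimal sketch computes the privacy parameters that the un-subsampled mechanism may satisfy so that the subsampled mechanism meets a target $(\epsilon, \delta)$. The function names `base_privacy_params` and `amplified_privacy_params` are hypothetical and not taken from the paper. The forward direction used for the round-trip check is the standard amplification-by-subsampling bound, which is assumed here rather than quoted from this page.

```python
import math


def base_privacy_params(eps, delta, r):
    """Parameters the base mechanism may satisfy, following Lemma 1.

    eps, delta: target guarantee for the subsampled mechanism.
    r: subsampling ratio (without replacement), 0 < r <= 1.
    """
    if not 0.0 < r <= 1.0:
        raise ValueError("subsampling ratio r must lie in (0, 1]")
    if eps <= 0.0 or delta < 0.0:
        raise ValueError("eps must be positive and delta non-negative")
    # log((e^eps - 1) / r + 1), written with expm1/log1p for numerical stability
    eps_base = math.log1p(math.expm1(eps) / r)
    delta_base = delta / r
    return eps_base, delta_base


def amplified_privacy_params(eps_base, delta_base, r):
    """Assumed forward direction: standard amplification by subsampling.

    Maps the base guarantee to (log(1 + r (e^eps_base - 1)), r * delta_base).
    """
    eps = math.log1p(r * math.expm1(eps_base))
    return eps, r * delta_base


if __name__ == "__main__":
    # Illustrative values only; r = 0.1 matches the ratio quoted for Figure 3.
    eps_base, delta_base = base_privacy_params(eps=1.0, delta=1e-5, r=0.1)
    print(f"base mechanism may use eps={eps_base:.4f}, delta={delta_base:.1e}")
    print(amplified_privacy_params(eps_base, delta_base, r=0.1))
```

Applying the forward map to the output of `base_privacy_params` recovers the target $(\epsilon, \delta)$, which is the sense in which Lemma 1 is the inverse of the subsampling lemma. A smaller $r$ yields a larger base $\epsilon$, so the mechanism applied to each subsample can be less strict while the overall target still holds.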

Figures (6)

  • Figure 1: Illustration of differentially private federated learning.
  • Figure 2: Workflow of the privacy-aware client selection.
  • Figure 3: Selection probability of 100 clients with heterogeneous dataset sizes and privacy budgets ($\epsilon$). Subsampling ratio $r$ is fixed to 0.1 for all clients.
  • Figure 4: Test accuracy on CIFAR-10.
  • Figure 5: Test accuracy versus batch size.
  • ...and 1 more figure

Theorems & Definitions (26)

  • Definition 1: Differential privacy
  • Lemma 1: Inverse version of the subsampling lemma [balle2018privacy]
  • Lemma 2: Gaussian mechanism with the strong composition theorem [kairouz2015composition]
  • Theorem 1
  • Remark 1
  • Theorem 2
  • Corollary 1
  • Remark 2
  • Corollary 2
  • proof
  • ...and 16 more