The Power of Bias: Optimizing Client Selection in Federated Learning with Heterogeneous Differential Privacy

Jiating Ma, Yipeng Zhou, Qi Li, Quan Z. Sheng, Laizhong Cui, Jiangchuan Liu

TL;DR

The paper addresses DP-enabled federated learning under heterogeneous client privacy budgets by deriving a convergence bound that exposes how privacy noise, data heterogeneity, and biased client selection affect model utility. It then casts client selection as a convex optimization problem over per-client participation counts, yielding both an approximate and an exact solution strategy (DPFL-BCS) that adaptively biases client participation to maximize utility. A two-stage approach is proposed: first estimate problem-related parameters, then solve for optimal participation counts to realize the bias-aware schedule. Extensive experiments across five datasets and both Gaussian and Laplace DP mechanisms show that DPFL-BCS consistently outperforms state-of-the-art baselines, achieving substantial utility gains under strong privacy heterogeneity and data non-IIDness, with practical, low overhead.
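To make the idea of "participation counts" concrete, the sketch below turns a given set of per-client counts into a round-by-round schedule. It is only an illustration and not the DPFL-BCS procedure itself: the counts are made-up placeholders, not solutions of the paper's optimization problem, and the function name `build_schedule`, the fixed number of clients per round, and the greedy "most remaining participations first" rule are all assumptions introduced here.

```python
from typing import Dict, List


def build_schedule(counts: Dict[str, int], rounds: int, per_round: int) -> List[List[str]]:
    """Greedy schedule: each round picks the clients with the most participations left.

    Assumptions (hypothetical, not from the paper): a fixed number of clients
    per round, every count at most `rounds`, and counts summing to
    rounds * per_round.
    """
    if any(c < 0 or c > rounds for c in counts.values()):
        raise ValueError("each count must lie in [0, rounds]")
    if sum(counts.values()) != rounds * per_round:
        raise ValueError("counts must sum to rounds * per_round")

    remaining = dict(counts)
    schedule: List[List[str]] = []
    for _ in range(rounds):
        # Rank by remaining participations (descending); break ties by client id
        # so the result is deterministic.
        ranked = sorted(remaining, key=lambda c: (-remaining[c], c))
        chosen = ranked[:per_round]
        for c in chosen:
            remaining[c] -= 1
        schedule.append(chosen)
    return schedule


# Placeholder counts for four clients over 5 rounds with 2 clients per round.
example_counts = {"a": 5, "b": 3, "c": 1, "d": 1}
example_schedule = build_schedule(example_counts, rounds=5, per_round=2)
```

Under the two stated conditions on the counts, picking the clients with the most remaining participations never strands a client with more participations left than rounds left, so every client ends up selected exactly as often as its count requires. A biased schedule in this sense simply means the counts differ across clients, for example client "a" above joins every round while "c" and "d" join once.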

Abstract

To preserve data privacy, the federated learning (FL) paradigm has emerged, in which clients expose only model gradients, rather than raw data, for model training. To strengthen the protection of model gradients in FL, differentially private federated learning (DPFL) has been proposed, which adds differentially private (DP) noise to obfuscate gradients before they are exposed. Yet an essential but largely overlooked problem in DPFL is the heterogeneity of clients' privacy requirements, which can vary significantly between clients and greatly complicates client selection. In other words, both data quality and the influence of DP noise should be taken into account when selecting clients. To address this problem, we conduct a convergence analysis of DPFL under heterogeneous privacy, a generic client selection strategy, popular DP mechanisms, and convex loss. Based on this analysis, we formulate client selection as the problem of minimizing the loss function in DPFL with heterogeneous privacy, which is a convex optimization problem and can be solved efficiently. Accordingly, we propose the DPFL-BCS (biased client selection) algorithm. Extensive experiments on real datasets under both convex and non-convex loss functions indicate that DPFL-BCS can remarkably improve model utility compared with state-of-the-art (SOTA) baselines.

Paper Structure (30 sections, 9 theorems, 14 equations, 14 figures, 3 tables, 2 algorithms)

Key Result

Theorem 1

(Gaussian Mechanism; Dwork and Roth, 2014). Let $\epsilon\in(0,1)$. Given the dataset $\mathcal{D}$ and query input $\mathbf{w}\in \mathbb{R}^d$, where $d$ is the dimension of the input parameters $\mathbf{w}$, the Gaussian mechanism satisfying $(\epsilon, \delta)$-DP distorts query results with …
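The theorem statement is cut off in this copy, so the following is a minimal sketch based on the commonly quoted textbook calibration of the Gaussian mechanism for $\epsilon\in(0,1)$, namely a noise standard deviation of at least $\sqrt{2\ln(1.25/\delta)}\,\Delta/\epsilon$, where $\Delta$ is the L2 sensitivity of the query. Whether the paper uses exactly this form is an assumption. The function names, the `sensitivity` parameter, and the client budgets below are hypothetical and chosen only to show how heterogeneous privacy budgets translate into different noise levels.

```python
import math
import random
from typing import List


def gaussian_sigma(epsilon: float, delta: float, sensitivity: float) -> float:
    """Textbook noise scale sqrt(2 ln(1.25/delta)) * sensitivity / epsilon.

    This calibration is only stated for 0 < epsilon < 1, matching the
    condition in the theorem above.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def privatize(vector: List[float], epsilon: float, delta: float,
              sensitivity: float, rng: random.Random) -> List[float]:
    """Add independent zero-mean Gaussian noise to every coordinate."""
    sigma = gaussian_sigma(epsilon, delta, sensitivity)
    return [v + rng.gauss(0.0, sigma) for v in vector]


# Hypothetical clients with different privacy budgets (smaller epsilon = stricter).
sigma_strict = gaussian_sigma(epsilon=0.2, delta=1e-5, sensitivity=1.0)
sigma_loose = gaussian_sigma(epsilon=0.8, delta=1e-5, sensitivity=1.0)

rng = random.Random(0)  # seeded only so the illustration is repeatable
noisy_update = privatize([0.1, -0.3, 0.05], epsilon=0.2, delta=1e-5,
                         sensitivity=1.0, rng=rng)
```

Because the noise scale is inversely proportional to $\epsilon$, the client with $\epsilon = 0.2$ in this sketch adds noise with four times the standard deviation of the client with $\epsilon = 0.8$. This is the kind of gap that makes uniform client selection questionable when privacy budgets are heterogeneous.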

Figures (14)

  • Figure 1: A specific example to illustrate the influence of heterogeneous privacy and heterogeneous data on client selection in DPFL.
  • Figure 2: Model utility comparison of different algorithms under fixed privacy heterogeneity (GM = Gaussian Mechanism, LM = Laplace Mechanism).
  • Figure 3: The estimation of problem-related parameter $\Gamma_n$ and the optimal client selection decision $T_n^*$ in MNIST with Gaussian Mechanism.
  • Figure 4: Final model utility comparison of different algorithms by varying $\epsilon_{max}$ (GM = Gaussian Mechanism, LM = Laplace Mechanism).
  • Figure 5: Final model utility comparison of different algorithms by varying $\alpha$ (GM = Gaussian Mechanism, LM = Laplace Mechanism).
  • ...and 9 more figures

Theorems & Definitions (15)

  • Definition 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma 1
  • Lemma 2
  • Definition 2
  • Definition 3
  • Theorem 5
  • ...and 5 more