The Power of Bias: Optimizing Client Selection in Federated Learning with Heterogeneous Differential Privacy
Jiating Ma, Yipeng Zhou, Qi Li, Quan Z. Sheng, Laizhong Cui, Jiangchuan Liu
TL;DR
The paper addresses DP-enabled federated learning under heterogeneous client privacy budgets by deriving a convergence bound that exposes how privacy noise, data heterogeneity, and biased client selection affect model utility. It then casts client selection as a convex optimization problem over per-client participation counts, yielding both an approximate and an exact solution strategy (DPFL-BCS) that adaptively biases client participation to maximize utility. The approach has two stages: first estimate the problem-related parameters, then solve for the optimal participation counts that realize the bias-aware schedule. Extensive experiments across five datasets and both Gaussian and Laplace DP mechanisms show that DPFL-BCS consistently outperforms state-of-the-art baselines, with substantial utility gains under strong privacy heterogeneity and non-IID data, at low practical overhead.
Abstract
To preserve data privacy, the federated learning (FL) paradigm has emerged, in which clients expose only model gradients, rather than their original data, for model training. To strengthen the protection of model gradients in FL, differentially private federated learning (DPFL) has been proposed, which adds differentially private (DP) noise to obfuscate gradients before they are exposed. Yet an essential but largely overlooked problem in DPFL is the heterogeneity of clients' privacy requirements, which can vary significantly between clients and greatly complicates client selection. In other words, both data quality and the influence of DP noise should be taken into account when selecting clients. To address this problem, we conduct a convergence analysis of DPFL under heterogeneous privacy, a generic client selection strategy, popular DP mechanisms, and convex loss. Based on this analysis, we formulate client selection as the problem of minimizing the loss function in DPFL with heterogeneous privacy, which is a convex optimization problem and can be solved efficiently. Accordingly, we propose the DPFL-BCS (biased client selection) algorithm. Extensive experimental results on real datasets under both convex and non-convex loss functions indicate that DPFL-BCS can remarkably improve model utility compared with SOTA baselines.
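To make the tension between privacy heterogeneity and client selection concrete, the following is a minimal sketch of how a client's privacy budget could translate into gradient noise. It is not the paper's implementation: it assumes the classical Gaussian mechanism (noise scale proportional to sqrt(2 ln(1.25/delta)) / epsilon, valid for 0 < epsilon < 1), treats the clipping norm as the sensitivity, and uses hypothetical names (`gaussian_sigma`, `clip`, `privatize`, `budgets`) and illustrative budget values.

```python
import math
import random


def gaussian_sigma(epsilon, delta, clip_norm):
    """Noise scale of the classical Gaussian mechanism (assumes 0 < epsilon < 1)."""
    return clip_norm * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip(grad, clip_norm):
    """Rescale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def privatize(grad, epsilon, delta=1e-5, clip_norm=1.0, rng=random):
    """Clip a client's gradient, then add independent Gaussian noise per coordinate."""
    sigma = gaussian_sigma(epsilon, delta, clip_norm)
    return [g + rng.gauss(0.0, sigma) for g in clip(grad, clip_norm)]


# Hypothetical per-client budgets: a smaller epsilon means stronger privacy,
# hence a larger noise scale on that client's exposed gradient.
budgets = {"client_a": 0.2, "client_b": 0.5, "client_c": 0.9}
sigmas = {name: gaussian_sigma(eps, 1e-5, 1.0) for name, eps in budgets.items()}
```

Under these assumptions, `client_a` injects the most noise and `client_c` the least, so a selection rule that ignores `sigmas` treats very differently perturbed gradients as equally useful. This is the trade-off that a biased selection schedule has to weigh against data quality.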
