Adaptive Heterogeneous Client Sampling for Federated Learning over Wireless Networks
Bing Luo, Wenli Xiao, Shiqiang Wang, Jianwei Huang, Leandros Tassiulas
TL;DR
This work minimizes the wall-clock convergence time of federated learning over bandwidth-limited wireless networks by jointly optimizing adaptive, heterogeneous client sampling and bandwidth allocation. It derives a tractable convergence bound for arbitrary client sampling probabilities, formulates an approximate non-convex optimization problem, and develops a practical two-stage method that first learns the unknown parameters in the bound and then computes an effective sampling distribution. The proposed scheme significantly reduces convergence time compared with baseline sampling schemes on a hardware prototype and in simulations, including non-convex loss scenarios. The results reveal how system heterogeneity (computation and communication times) and statistical heterogeneity (data quality/quantity) interact to shape the optimal client sampling, and show a clear trade-off in the number of clients sampled per round.
Abstract
Federated learning (FL) algorithms usually sample a fraction of clients in each round (partial participation) when the number of participants is large and the server's communication bandwidth is limited. Recent works on the convergence analysis of FL have focused on unbiased client sampling, e.g., sampling uniformly at random, which converges slowly in wall-clock time under high degrees of system heterogeneity and statistical heterogeneity. This paper aims to design an adaptive client sampling algorithm for FL over wireless networks that tackles both system and statistical heterogeneity to minimize the wall-clock convergence time. We obtain a new tractable convergence bound for FL algorithms with arbitrary client sampling probabilities. Based on the bound, we analytically establish the relationship between the total learning time and the sampling probabilities under an adaptive bandwidth allocation scheme, which results in a non-convex optimization problem. We design an efficient algorithm for learning the unknown parameters in the convergence bound and develop a low-complexity algorithm to approximately solve the non-convex problem. Our solution reveals the impact of system and statistical heterogeneity parameters on the optimal client sampling design. Moreover, our solution shows that as the number of sampled clients increases, the total convergence time first decreases and then increases: a larger sampling number reduces the number of rounds needed for convergence but lengthens the expected time per round because of the limited wireless bandwidth. Experimental results from both a hardware prototype and simulations demonstrate that our proposed sampling scheme significantly reduces the convergence time compared to several baseline sampling schemes.
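The trade-off in the number of sampled clients can be illustrated with a toy model. The sketch below is not the paper's bound or algorithm. It assumes, purely for illustration, that the number of rounds to converge scales as alpha / K + beta and that the per-round time is t_comp + K * t_comm, where the bandwidth is shared equally among the K sampled clients. The function names and all constants (alpha, beta, t_comp, t_comm) are hypothetical.

```python
def total_time(k, alpha=200.0, beta=20.0, t_comp=2.0, t_comm=0.5):
    """Toy total wall-clock time when k clients are sampled per round.

    All parameters are hypothetical placeholders, not values from the paper.
    """
    # Assumed: more sampled clients -> fewer rounds needed to converge.
    rounds = alpha / k + beta
    # Assumed: k clients share a fixed bandwidth, so upload time grows with k.
    round_time = t_comp + k * t_comm
    return rounds * round_time


def best_k(max_k, **params):
    """Return the k in 1..max_k that minimizes the toy total time."""
    return min(range(1, max_k + 1), key=lambda k: total_time(k, **params))


if __name__ == "__main__":
    k_star = best_k(20)
    for k in (1, k_star, 20):
        print(f"K={k:2d}  total time = {total_time(k):.1f}")
```

Under these assumptions the total time is convex in K: it falls while the saving in rounds dominates and rises once the shared bandwidth makes each round too slow, which matches the qualitative behavior described in the abstract.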
