Federated Learning in the Presence of Adversarial Client Unavailability
Lili Su, Ming Xiang, Jiaming Xu, Pengkun Yang
TL;DR
The paper addresses federated learning under adversarial, history-dependent client unavailability by introducing an $\epsilon$-adversary dropout model and analyzing simple FedAvg/FedProx variants under $(B,G)$-bounded dissimilarity. It establishes near-optimal estimation-error bounds for non-convex and strongly convex objectives, showing error scales as $\epsilon(G^2+\sigma^2)$ or $\epsilon(G^2+\sigma^2)/\mu^2$, respectively, and proves minimax lower bounds that match up to constants. It derives convergence rates of $O(1/\sqrt{T})$ for non-convex and $O(1/T)$ for strongly convex cases, demonstrating these are optimal for first-order methods with noisy gradients. Theoretical results are complemented by extensive experiments on CIFAR-10, Shakespeare NLP, and synthetic data, showing the proposed FedAvg/FedProx variants outperform baselines under adversarial dropout while remaining memory-efficient and scalable. Overall, the work provides a rigorous treatment of adversarial client unavailability and offers practical, provably robust FL algorithms for challenging real-world deployments.
Abstract
Federated learning is a decentralized machine learning framework that enables collaborative model training without revealing raw data. Due to diverse hardware and software limitations, a client may not always be available for the computation requests from the parameter server. An emerging line of research is devoted to tackling arbitrary client unavailability. However, existing work still imposes structural assumptions on the unavailability patterns, which limits its applicability in challenging scenarios wherein the unavailability patterns are beyond the control of the parameter server. Moreover, in harsh environments like battlefields, adversaries can selectively and adaptively silence specific clients. In this paper, we relax the structural assumptions and consider adversarial client unavailability. To quantify the degree of client unavailability, we use the notion of $\epsilon$-adversary dropout fraction. We show that simple variants of FedAvg or FedProx, albeit completely agnostic to $\epsilon$, converge to an estimation error on the order of $\epsilon(G^2 + \sigma^2)$ for non-convex global objectives and $\epsilon(G^2 + \sigma^2)/\mu^2$ for $\mu$-strongly convex global objectives, where $G$ is a heterogeneity parameter and $\sigma^2$ is the noise level. Conversely, we prove that any algorithm has to suffer an estimation error of at least $\epsilon(G^2 + \sigma^2)/8$ for non-convex global objectives and $\epsilon(G^2 + \sigma^2)/(8\mu^2)$ for $\mu$-strongly convex global objectives. Furthermore, the convergence speeds of the FedAvg or FedProx variants are $O(1/\sqrt{T})$ for non-convex objectives and $O(1/T)$ for strongly convex objectives, both of which are the best possible for any first-order method that only has access to noisy gradients.
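To make the setting concrete, the following is a minimal toy sketch, not the paper's algorithm or experimental setup. It assumes scalar quadratic client objectives $f_i(x) = \frac{1}{2}(x - c_i)^2$, a plain FedAvg-style server that averages whichever local models arrive, and a simple fixed adversary that always silences the $\epsilon$ fraction of clients with the largest $c_i$. The function name `simulate_fedavg_dropout` and all of its parameters are hypothetical choices made for illustration. The sketch only shows that adversarial dropout leaves a residual error that does not vanish with more rounds; it does not reproduce the error bounds stated above.

```python
import random


def simulate_fedavg_dropout(num_clients=100, eps=0.1, rounds=200,
                            local_steps=5, lr=0.1, noise_std=0.1, seed=0):
    """Toy FedAvg on f_i(x) = 0.5 * (x - c_i)^2 with adversarial dropout.

    Returns |x_T - x_star|, where x_star minimizes the average of all f_i.
    Everything here is an illustrative assumption, not the paper's method.
    """
    rng = random.Random(seed)
    # Heterogeneous clients: each has its own local minimizer c_i.
    centers = [rng.gauss(0.0, 1.0) for _ in range(num_clients)]
    x_star = sum(centers) / num_clients  # minimizer of the global objective

    # Assumed adversary: silence the eps-fraction with the largest centers,
    # which biases the server's average in one direction.
    num_silenced = int(eps * num_clients)
    by_center = sorted(range(num_clients), key=lambda i: centers[i], reverse=True)
    silenced = set(by_center[:num_silenced])
    responding = [i for i in range(num_clients) if i not in silenced]

    x = 0.0
    for _ in range(rounds):
        local_models = []
        for i in responding:
            x_i = x
            for _ in range(local_steps):
                # Noisy gradient of f_i at x_i.
                grad = (x_i - centers[i]) + rng.gauss(0.0, noise_std)
                x_i -= lr * grad
            local_models.append(x_i)
        # Server is agnostic to eps: it averages whatever it received.
        x = sum(local_models) / len(local_models)
    return abs(x - x_star)


if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.2):
        err = simulate_fedavg_dropout(eps=eps)
        print(f"eps={eps:.1f}  |x_T - x_star| = {err:.4f}")
```

In this toy setup, setting `eps=0.0` lets the iterate approach the global minimizer up to gradient noise, while any positive `eps` pulls the limit toward the mean of the responding clients only, so running longer does not remove the gap.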
