Federated Learning in the Presence of Adversarial Client Unavailability

Lili Su, Ming Xiang, Jiaming Xu, Pengkun Yang

TL;DR

The paper addresses federated learning under adversarial, history-dependent client unavailability by introducing an $\epsilon$-adversary dropout model and analyzing simple FedAvg/FedProx variants under $(B,G)$-bounded dissimilarity. It establishes near-optimal estimation-error bounds for non-convex and strongly convex objectives, showing error scales as $\epsilon(G^2+\sigma^2)$ or $\epsilon(G^2+\sigma^2)/\mu^2$, respectively, and proves minimax lower bounds that match up to constants. It derives convergence rates of $O(1/\sqrt{T})$ for non-convex and $O(1/T)$ for strongly convex cases, demonstrating these are optimal for first-order methods with noisy gradients. Theoretical results are complemented by extensive experiments on CIFAR-10, Shakespeare NLP, and synthetic data, showing the proposed FedAvg/FedProx variants outperform baselines under adversarial dropout while remaining memory-efficient and scalable. Overall, the work provides a rigorous treatment of adversarial client unavailability and offers practical, provably robust FL algorithms for challenging real-world deployments.

Abstract

Federated learning is a decentralized machine learning framework that enables collaborative model training without revealing raw data. Due to the diverse hardware and software limitations, a client may not always be available for the computation requests from the parameter server. An emerging line of research is devoted to tackling arbitrary client unavailability. However, existing work still imposes structural assumptions on the unavailability patterns, impeding their applicability in challenging scenarios wherein the unavailability patterns are beyond the control of the parameter server. Moreover, in harsh environments like battlefields, adversaries can selectively and adaptively silence specific clients. In this paper, we relax the structural assumptions and consider adversarial client unavailability. To quantify the degrees of client unavailability, we use the notion of $\epsilon$-adversary dropout fraction. We show that simple variants of FedAvg or FedProx, albeit completely agnostic to $\epsilon$, converge to an estimation error on the order of $\epsilon(G^2 + \sigma^2)$ for non-convex global objectives and $\epsilon(G^2 + \sigma^2)/\mu^2$ for $\mu$-strongly convex global objectives, where $G$ is a heterogeneity parameter and $\sigma^2$ is the noise level. Conversely, we prove that any algorithm has to suffer an estimation error of at least $\epsilon(G^2 + \sigma^2)/8$ and $\epsilon(G^2 + \sigma^2)/(8\mu^2)$ for non-convex global objectives and $\mu$-strongly convex global objectives. Furthermore, the convergence speeds of the FedAvg or FedProx variants are $O(1/\sqrt{T})$ for non-convex objectives and $O(1/T)$ for strongly-convex objectives, both of which are the best possible for any first-order method that only has access to noisy gradients.
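
To make the setting concrete, here is a minimal toy sketch rather than the paper's algorithm. The abstract does not spell out the FedAvg/FedProx variants or the formal definition of the $\epsilon$-adversary dropout fraction, so the sketch assumes (i) plain FedAvg that averages over whichever clients respond, (ii) an adversary that may silence at most a fraction $\epsilon$ of the clients in every round, and (iii) scalar quadratic local objectives $F_i(x) = \tfrac{1}{2}(x - t_i)^2$. The function name `fedavg_with_dropout`, its parameters, and the adversary's rule are all hypothetical.

```python
import numpy as np

def fedavg_with_dropout(targets, eps, rounds=200, local_steps=5, lr=0.1,
                        noise=0.0, seed=0):
    """Toy FedAvg on scalar quadratics F_i(x) = 0.5 * (x - targets[i])**2.

    Assumption (not from the paper): in each round the adversary sees every
    local model and silences the floor(eps * M) clients with the largest
    local model; the server averages the clients that remain.
    """
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets, dtype=float)
    m = targets.size
    n_drop = int(np.floor(eps * m))
    if n_drop >= m:
        raise ValueError("eps must leave at least one client available")

    x = 0.0  # global model held by the parameter server
    for _ in range(rounds):
        local = np.full(m, x)  # every client starts from the global model
        for _ in range(local_steps):
            # noisy local gradient of F_i at the current local model
            grad = (local - targets) + noise * rng.standard_normal(m)
            local -= lr * grad
        # history-dependent adversary: drop the n_drop largest local models
        keep = np.argsort(local)[: m - n_drop]
        x = local[keep].mean()  # server aggregates only the responders
    return x

if __name__ == "__main__":
    targets = np.linspace(-1.0, 1.0, 11)  # heterogeneous local optima
    print(fedavg_with_dropout(targets, eps=0.0))
    print(fedavg_with_dropout(targets, eps=0.2))
```

With these evenly spaced targets the global optimum is 0. Without dropout the iterate stays there, while with `eps=0.2` it settles near -0.2, the mean of the nine clients the adversary lets through. The shift grows with both `eps` and the spread of the local optima, which is the qualitative behavior that the $\epsilon(G^2 + \sigma^2)$ error terms in the abstract describe.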

Paper Structure (37 sections, 8 theorems, 98 equations, 13 figures, 1 table)


Key Result

Theorem 1.1

Let $\sigma$ be the average noise level of the stochastic gradients. For $\sqrt{\epsilon} B \le 0.1$, where $\sup_{F_1, \ldots, F_M}$ is taken over all local objectives that collectively satisfy the $(B,G)$-heterogeneity condition, $\mathcal{A}$ is the set of all adversarial client unavailability patterns subject to the $\epsilon$-adversary dropout fraction, and $\inf_{\widehat{\theta}}$ is taken over all algor

Figures (13)

  • Figure 1: Populations generated from a Dirichlet distribution ($\alpha=0.1$) with different numbers of clients. Each row shows the empirical class distribution of the local data. The colors correspond to data with different class labels.
  • Figure 2: CIFAR-10 results with Dirichlet parameter $\alpha=0.1$ and dropout fraction $\epsilon=0.8$ under the adversarial client unavailability scheme described in the paper's subsection on adversarial client unavailability.
  • Figure 3: Natural language processing task with dropout fraction $\epsilon=0.7$ under a different adversarial client unavailability scheme, in which the adversary inspects each client's local gradient improvement and removes the clients with the greatest improvements, subject to the paper's adversarial unavailability assumption. Details can be found in the paper's NLP subsection.
  • Figure 4: Synthetic datasets: clients' local data volume histogram.
  • Figure 5: Synthetic datasets: comparison with baselines at dropout fraction $\epsilon=0.9$ under the adversarial client unavailability scheme described in the paper's subsection on adversarial client unavailability.
  • ...and 8 more figures
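
The Figure 1 caption mentions client populations generated from a Dirichlet distribution with $\alpha=0.1$. As a rough illustration of what such a population can look like, the sketch below draws one class-proportion vector per client from a symmetric Dirichlet distribution. This per-client convention is an assumption; the caption does not say exactly how the authors partition the data, and the name `dirichlet_class_mix` is hypothetical.

```python
import numpy as np

def dirichlet_class_mix(num_clients, num_classes=10, alpha=0.1, seed=0):
    """Sample one class-proportion vector per client.

    Assumption (not confirmed by the source): each client's class mix is an
    independent draw from a symmetric Dirichlet(alpha) over the classes.
    Returns an array of shape (num_clients, num_classes) whose rows sum to 1.
    """
    rng = np.random.default_rng(seed)
    return rng.dirichlet(alpha * np.ones(num_classes), size=num_clients)

if __name__ == "__main__":
    mix = dirichlet_class_mix(num_clients=5)
    # Small alpha concentrates each row on a few classes (highly non-IID).
    print(np.round(mix, 2))
```

Smaller values of `alpha` make each row more concentrated on a few classes, so clients look less alike; larger values push every row toward the uniform class mix.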

Theorems & Definitions (16)

  • Theorem 1.1: Informal
  • Theorem 4.3
  • Corollary 4.4
  • Theorem 4.5
  • Remark 4.6: Convex objective functions
  • Lemma 5.1
  • Lemma 5.2
  • Theorem 6.1
  • Remark 6.2: The impact of dissimilarity parameter $B$
  • Remark 6.3: Convergence rate in $T$
  • ...and 6 more