A Unified Analysis of Federated Learning with Arbitrary Client Participation

Shiqiang Wang, Mingyue Ji

TL;DR

This work tackles federated learning with arbitrary client participation by introducing a generalized FedAvg that amplifies accumulated updates every $P$ rounds. It derives a unified convergence bound that captures participation effects via a single term $\tilde{\delta}^2(P)$ and decomposes gradient divergence into $\tilde{\beta}^2$ and $\tilde{\nu}^2$, linking rate behavior to regularized and stochastic participation patterns. The authors show that regularized participation can achieve zero-variance bounds and, with appropriate $(\gamma,\eta)$, rates that match the centralized SGD lower bound, while stochastic participation can reach state-of-the-art FedAvg rates. Empirical results on non-IID data and periodic availability demonstrate that amplification improves convergence over standard FedAvg and waiting strategies, offering practical guidance for FL systems facing intermittent client participation.

Abstract

Federated learning (FL) faces challenges of intermittent client availability and computation/communication efficiency. As a result, only a small subset of clients can participate in FL at a given time. It is important to understand how partial client participation affects convergence, but most existing works have either considered idealized participation patterns or obtained results with non-zero optimality error for generic patterns. In this paper, we provide a unified convergence analysis for FL with arbitrary client participation. We first introduce a generalized version of federated averaging (FedAvg) that amplifies parameter updates at an interval of multiple FL rounds. Then, we present a novel analysis that captures the effect of client participation in a single term. By analyzing this term, we obtain convergence upper bounds for a wide range of participation patterns, including both non-stochastic and stochastic cases, which match either the lower bound of stochastic gradient descent (SGD) or the state-of-the-art results in specific settings. We also discuss various insights, recommendations, and experimental results.
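The amplification idea can be illustrated with a minimal sketch. This is not the paper's Algorithm 1 verbatim but a toy 1-D version under stated assumptions: each client $n$ minimizes the quadratic $\frac{1}{2}(x - c_n)^2$, gradients are noise-free, and active clients are weighted equally. The names `amplified_fedavg`, `client_targets`, `participation`, and `alternating` are hypothetical and chosen only for illustration; $\gamma$, $\eta$, and $P$ follow the summary above, and `I` is assumed to denote the number of local steps per round.

```python
def amplified_fedavg(client_targets, participation, gamma, eta, P, I, T, x0=0.0):
    """Toy 1-D FedAvg with amplification every P rounds (illustrative only).

    Client n is assumed to minimize 0.5 * (x - client_targets[n]) ** 2.
    participation(t) is a hypothetical callback returning the indices of the
    clients that are active in round t; active clients get equal weight.
    """
    x = x0
    x_anchor = x0  # model at the start of the current P-round interval
    for t in range(T):
        deltas = []
        for n in participation(t):
            y = x
            for _ in range(I):  # I local gradient steps with step size gamma
                y -= gamma * (y - client_targets[n])
            deltas.append(y - x)
        if deltas:  # skip aggregation if no client is available
            x += sum(deltas) / len(deltas)
        if (t + 1) % P == 0:
            # Amplify the update accumulated over the last P rounds by eta.
            x = x_anchor + eta * (x - x_anchor)
            x_anchor = x
    return x


def alternating(t):
    """Periodic availability: client 0 in even rounds, client 1 in odd rounds."""
    return [t % 2]


# eta = 1 recovers plain FedAvg; eta > 1 amplifies each P-round update.
plain = amplified_fedavg([-1.0, 1.0], alternating, gamma=0.1, eta=1.0,
                         P=2, I=1, T=20, x0=5.0)
amplified = amplified_fedavg([-1.0, 1.0], alternating, gamma=0.1, eta=2.0,
                             P=2, I=1, T=20, x0=5.0)
print(plain, amplified)
```

In this toy setup, both runs contract toward the same point, and the amplified run ends closer to it after the same number of rounds. This only illustrates the mechanism and says nothing about the paper's bounds.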

Paper Structure

This paper contains 23 sections, 13 theorems, 42 equations, 7 figures, 3 tables, 1 algorithm.

Key Result

Theorem 3.1

When the Lipschitz, gradient-noise, and alternative gradient-divergence assumptions hold, $\gamma \leq \frac{1}{12LIP}$, $\gamma \eta \leq \frac{1}{2LIP}$, and $P \leq \frac{T}{2}$, we have

Figures (7)

  • Figure 1: Results for different approaches with periodically connected clients ($P=500$).
  • Figure 2: Algorithm 1 with amplification and different $P$, with periodically connected clients.
  • Figure B.1: Motivating example: different local learning rates $\gamma$ without amplification (i.e., $\eta=1$). The trajectory from $\mathbf{x}_0$ to $\mathbf{x}_{15}$ shows how the model parameter changes from round $t=0$ to round $t=15$.
  • Figure B.2: Motivating example: fixed local learning rate $\gamma=0.05$ with different amplification factors $\eta$. The trajectory from $\mathbf{x}_0$ to $\mathbf{x}_{15}$ shows how the model parameter changes from round $t=0$ to round $t=15$. The cyan segments show the change in parameter $\mathbf{x}$ due to amplification, while the black segments show the parameter change due to regular SGD operation.
  • Figure D.1: Comparison between independent and regularized participation in the case of always available clients.
  • ...and 2 more figures

Theorems & Definitions (16)

  • Theorem 3.1
  • Proof Sketch
  • Corollary 3.2
  • Corollary 3.3
  • Proposition 4.1
  • Proposition 4.2: Regularized participation
  • Corollary 4.3: Regularized participation
  • Proposition 4.4: Ergodic participation
  • Lemma 4.5: [Billingsley, 1995, Thm. 27.4]
  • Proposition 4.6: Stationary and strongly mixing participation
  • ...and 6 more