A Unified Analysis of Federated Learning with Arbitrary Client Participation

Shiqiang Wang; Mingyue Ji

TL;DR

This work tackles federated learning with arbitrary client participation by introducing a generalized FedAvg that amplifies accumulated updates every $P$ rounds. It derives a unified convergence bound that captures participation effects via a single term $\tilde{\delta}^2(P)$ and decomposes gradient divergence into $\tilde{\beta}^2$ and $\tilde{\nu}^2$, linking rate behavior to regularized and stochastic participation patterns. The authors show that regularized participation can achieve zero-variance bounds and, with appropriate $(\gamma,\eta)$, rates that match the centralized SGD lower bound, while stochastic participation can reach state-of-the-art FedAvg rates. Empirical results on non-IID data and periodic availability demonstrate that amplification improves convergence over standard FedAvg and waiting strategies, offering practical guidance for FL systems facing intermittent client participation.

Abstract

Federated learning (FL) faces challenges of intermittent client availability and computation/communication efficiency. As a result, only a small subset of clients can participate in FL at a given time. It is important to understand how partial client participation affects convergence, but most existing works have either considered idealized participation patterns or obtained results with non-zero optimality error for generic patterns. In this paper, we provide a unified convergence analysis for FL with arbitrary client participation. We first introduce a generalized version of federated averaging (FedAvg) that amplifies parameter updates at an interval of multiple FL rounds. Then, we present a novel analysis that captures the effect of client participation in a single term. By analyzing this term, we obtain convergence upper bounds for a wide range of participation patterns, including both non-stochastic and stochastic cases, which match either the lower bound of stochastic gradient descent (SGD) or the state-of-the-art results in specific settings. We also discuss various insights, recommendations, and experimental results.
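The amplification idea can be illustrated with a minimal sketch. It is not the authors' reference implementation: it assumes toy quadratic client objectives $f_n(\mathbf{x}) = \frac{1}{2}\|\mathbf{x} - \mathbf{c}_n\|^2$, noiseless gradients, and an aggregation rule in which the server adds the weighted client updates each round and, every $P$ rounds, scales the update accumulated since the last amplification point by $\eta$. The names `generalized_fedavg`, `cyclic_participation`, `centers`, and `num_local_steps` are hypothetical.

```python
import numpy as np


def generalized_fedavg(centers, participation, gamma, eta, P, num_local_steps, T, x0=None):
    """Toy FedAvg with amplification every P rounds (illustrative assumptions only).

    centers: (N, d) array; client n minimizes 0.5 * ||x - centers[n]||^2.
    participation: callable t -> length-N weight vector (zero means absent).
    gamma: local learning rate; eta: amplification factor (eta = 1 is plain FedAvg).
    """
    centers = np.asarray(centers, dtype=float)
    x = np.zeros(centers.shape[1]) if x0 is None else np.array(x0, dtype=float)
    x_anchor = x.copy()  # model at the start of the current P-round interval

    for t in range(T):
        q = np.asarray(participation(t), dtype=float)
        aggregate = np.zeros_like(x)
        for n in np.flatnonzero(q):
            y = x.copy()
            for _ in range(num_local_steps):
                y -= gamma * (y - centers[n])  # exact gradient of the toy objective
            aggregate += q[n] * (y - x)  # weighted local update of client n
        x = x + aggregate

        if (t + 1) % P == 0:
            # Amplify the update accumulated over the last P rounds by eta.
            x = x_anchor + eta * (x - x_anchor)
            x_anchor = x.copy()
    return x


def cyclic_participation(num_clients, num_groups):
    """Assumed periodic pattern: group (t mod num_groups) participates in round t."""
    groups = np.array_split(np.arange(num_clients), num_groups)

    def weights(t):
        q = np.zeros(num_clients)
        active = groups[t % num_groups]
        q[active] = 1.0 / len(active)
        return q

    return weights


if __name__ == "__main__":
    centers = np.array([[4.0, 3.0], [3.0, 4.0], [2.0, 3.0], [3.0, 2.0]])
    schedule = cyclic_participation(num_clients=4, num_groups=2)
    for eta in (1.0, 2.0):
        x = generalized_fedavg(centers, schedule, gamma=0.1, eta=eta, P=2,
                               num_local_steps=2, T=20)
        print(eta, np.linalg.norm(x - centers.mean(axis=0)))
```

Setting `P` equal to the number of groups makes every client participate exactly once per amplification interval, which is one simple way to mimic a periodic availability pattern in this toy setting. With `eta=1.0` the routine reduces to ordinary FedAvg, so the two settings can be compared directly.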

Paper Structure (23 sections, 13 theorems, 42 equations, 7 figures, 3 tables, 1 algorithm)

Introduction
Generalized FedAvg with Amplified Updates
Convergence Analysis and Main Result
Interpreting and Applying the Unified Framework
Decomposition of Divergence
Effect of Partial Participation
Discussions and Insights
Experiments
Conclusion
Additional Related Works
Additional Discussions
Why Does Amplification Help?
Relating Different Participation Patterns to Practical FL Scenarios
Proofs
Preliminaries
...and 8 more sections

Key Result

Theorem 3.1

When the Lipschitz, gradient noise, and gradient divergence (alternative form) assumptions hold, $\gamma \leq \frac{1}{12LIP}$, $\gamma\eta \leq \frac{1}{2LIP}$, and $P \leq \frac{T}{2}$, we have

Figures (7)

Figure 1: Results for different approaches with periodically connected clients ($P=500$).
Figure 2: Algorithm 1 with amplification and different $P$, with periodically connected clients.
Figure B.1: Motivating example: different local learning rates $\gamma$ without amplification (i.e., $\eta=1$). The trajectory from $\mathbf{x}_0$ to $\mathbf{x}_{15}$ shows how the model parameter changes from round $t=0$ to round $t=15$.
Figure B.2: Motivating example: fixed local learning rate $\gamma=0.05$ with different amplification factors $\eta$. The trajectory from $\mathbf{x}_0$ to $\mathbf{x}_{15}$ shows how the model parameter changes from round $t=0$ to round $t=15$. The cyan segments show the change in the parameter $\mathbf{x}$ due to amplification, while the black segments show the parameter change due to regular SGD operation.
Figure D.1: Comparison between independent and regularized participation in the case of always available clients.
...and 2 more figures

Theorems & Definitions (16)

Theorem 3.1
Proof sketch of Theorem 3.1
Corollary 3.2
Corollary 3.3
Proposition 4.1
Proposition 4.2: Regularized participation
Corollary 4.3: Regularized participation
Proposition 4.4: Ergodic participation
Lemma 4.5: [Billingsley, 1995, Thm. 27.4]
Proposition 4.6: Stationary and strongly mixing participation
...and 6 more
