Painless Federated Learning: An Interplay of Line-Search and Extrapolation

Geetika; Somya Tyagi; Bapi Chatterjee

TL;DR

The paper tackles the slowdown in federated optimization caused by client heterogeneity and gradient noise. It introduces FedSLS, which applies an Armijo-style stochastic line search at the clients, and FedExpSLS, which adds extrapolation of the server's learning rate (LR). The authors prove that FedSLS achieves deterministic convergence rates in expectation, including linear convergence for strongly convex objectives under partial client participation, and that FedExpSLS also converges. The analysis relies on smoothness, convexity, and interpolation-type assumptions. Experiments on convex and non-convex tasks show the methods performing on par with or better than popular federated learning algorithms. The work suggests that stochastic Armijo line search can bound client drift and accelerate federated learning, offering a practical path toward more robust FL algorithms in heterogeneous environments.

Abstract

The classical line search for learning rate (LR) tuning in the stochastic gradient descent (SGD) algorithm can tame the convergence slowdown due to data-sampling noise. In a federated setting, where client heterogeneity slows down global convergence, line search can be adapted to the same end. In this work, we show that a stochastic variant of line search tames the slowdown due to heterogeneity in federated optimization, in addition to that due to client-local gradient noise. To this end, we introduce the Federated Stochastic Line Search (FedSLS) algorithm and show that it achieves deterministic rates in expectation. Specifically, FedSLS offers linear convergence for strongly convex objectives even with partial client participation. Recently, extrapolation of the server's LR has shown promise for improved empirical performance in federated learning. To benefit from extrapolation, we extend FedSLS to Federated Extrapolated Stochastic Line Search (FedExpSLS) and prove its convergence. Our extensive empirical results show that the proposed methods perform on par with or better than popular federated learning algorithms across many convex and non-convex problems.
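The client-side idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's FedSLS or FedExpSLS algorithm: the function names (`armijo_step`, `make_client`, `fed_round`, `run_demo`), the toy least-squares clients, the defaults `c=0.5`, `beta=0.5`, `eta_max=1.0`, and the plain `server_lr` scaling are all assumptions made for illustration. The toy data is built so that every client shares one minimizer, which mimics an interpolation-type setting.

```python
import random


def armijo_step(loss, grad, w, eta_max=1.0, c=0.5, beta=0.5, max_backtracks=50):
    """One gradient step whose size is found by Armijo backtracking.

    loss and grad are assumed to be evaluated on the same (mini-batch) data.
    Returns (new_iterate, accepted_step_size).
    """
    g = grad(w)
    g_sq = sum(gi * gi for gi in g)
    f0 = loss(w)
    eta = eta_max
    for _ in range(max_backtracks):
        w_new = [wi - eta * gi for wi, gi in zip(w, g)]
        # Armijo sufficient-decrease test.
        if loss(w_new) <= f0 - c * eta * g_sq:
            return w_new, eta
        eta *= beta  # shrink the step and try again
    return list(w), 0.0  # no acceptable step found: keep the iterate


def make_client(a, b):
    """Hypothetical client with loss f_i(w) = 0.5 * (a . w - b)^2."""
    def loss(w):
        r = sum(ai * wi for ai, wi in zip(a, w)) - b
        return 0.5 * r * r

    def grad(w):
        r = sum(ai * wi for ai, wi in zip(a, w)) - b
        return [r * ai for ai in a]

    return loss, grad


def fed_round(w, clients, rng, sample_size=2, local_steps=5, server_lr=1.0):
    """One round: sample clients, run local line-search steps, average updates."""
    chosen = rng.sample(clients, sample_size)  # partial participation
    deltas = []
    for loss, grad in chosen:
        w_i = list(w)
        for _ in range(local_steps):
            w_i, _ = armijo_step(loss, grad, w_i)
        deltas.append([wi - w0 for wi, w0 in zip(w_i, w)])
    mean_delta = [sum(col) / len(deltas) for col in zip(*deltas)]
    # server_lr = 1.0 is plain averaging; a value above 1 would extrapolate.
    # This fixed scaling is a placeholder, not the paper's extrapolation rule.
    return [w0 + server_lr * d for w0, d in zip(w, mean_delta)]


def run_demo(rounds=200, seed=0):
    w_star = [1.0, -2.0]  # shared minimizer of all toy clients
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    clients = [
        make_client(a, sum(ai * wi for ai, wi in zip(a, w_star)))
        for a in features
    ]
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(rounds):
        w = fed_round(w, clients, rng)
    total_loss = sum(loss(w) for loss, _ in clients)
    return w, total_loss
```

Calling `run_demo()` returns the final global iterate and the summed client loss. In this toy setting no client learning rate is hand-tuned; the backtracking loop selects it at every local step.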

Paper Structure (28 sections, 22 theorems, 137 equations, 5 figures, 2 tables, 5 algorithms)

Introduction
Federated learning.
Related Work
Algorithm and Assumptions
Convergence Results
Deterministic rates for SGD
Towards Deterministic Rates in Federated Learning
Armijo line search vs. bounded heterogeneity
Convergence of FedSLS
Convergence of FedExpSLS
Experiments and Numerical Results
Conclusion and Discussion
Armijo Line Search Algorithm
Discussion on Deterministic Learning Rate
Model Update Algorithms for Federated Learning
...and 13 more sections

Key Result

Theorem 1

Let the objective function $f_i$ of the $i$-th device be $L$-smooth and convex, and let the function estimates in the Armijo line search be $\kappa_f$-sufficiently accurate in expectation. For $\tilde{c}>\frac{1}{2}$, SGD with Armijo line search (Definition 1) achieves the convergence rate of deterministic gradient descent, where $\tilde{c}:=c-2\kappa_{f}\eta_{l_{\max}}$ and $\bar{w}_K=\frac{1}{K}\sum_{k=1}^{K}{w}_{k-1}$.
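For orientation, the stochastic Armijo condition is commonly written as below. This is the standard textbook form and is given here as an assumption; the paper's Definition 1 may use different notation. Here $\xi_k$ denotes the mini-batch sampled at step $k$ (a symbol introduced for this illustration), $c$ is the Armijo constant, and $\eta_{l_{\max}}$ is the largest client step size tried.

```latex
f_{\xi_k}\!\bigl(w_k - \eta_k \nabla f_{\xi_k}(w_k)\bigr)
  \;\le\; f_{\xi_k}(w_k) \;-\; c\,\eta_k\,\bigl\|\nabla f_{\xi_k}(w_k)\bigr\|^{2},
\qquad 0 < \eta_k \le \eta_{l_{\max}} .
```

Backtracking starts from $\eta_k=\eta_{l_{\max}}$ and shrinks $\eta_k$ by a constant factor until the inequality holds.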

Figures (5)

Figure 1: Efficacy of line search.
Figure 2: Training Loss vs Communication Rounds
Figure 3: Test Accuracy vs Communication Rounds
Figure 4: Average Line Search Steps vs Communication Rounds
Figure 5: CIFAR-10 experiments with varying $c$ values

Theorems & Definitions (41)

Definition 1: Armijo Condition
Definition 2: Sample-wise Interpolation
Definition 3: $\kappa^i_f$-accurate function
Remark 1
Theorem 1
Lemma 1
Remark 2
Theorem 2: $f_i$ are convex
Theorem 3: $f_i$ are strongly convex
Theorem 4: $f_i$ are non-convex
...and 31 more
