On the Convergence of Federated Learning Algorithms without Data Similarity

Ali Beikmohammadi; Sarit Khirirat; Sindri Magnússon

TL;DR

This paper presents a unified framework for analyzing the convergence of federated learning algorithms without data similarity conditions. From this framework it derives precise step size expressions for three widely used schedules (fixed, diminishing, and step-decay), none of which depend on data similarity.

Abstract

Data similarity assumptions have traditionally been relied upon to understand the convergence behavior of federated learning methods. Unfortunately, this approach often demands fine-tuning step sizes based on the level of data similarity. When data similarity is low, the resulting small step sizes lead to unacceptably slow convergence for federated methods. In this paper, we present a novel and unified framework for analyzing the convergence of federated learning algorithms without the need for data similarity conditions. Our analysis centers on an inequality that captures the influence of step sizes on algorithmic convergence performance. By applying our theorems to well-known federated algorithms, we derive precise expressions for three widely used step size schedules (fixed, diminishing, and step-decay step sizes) that are independent of data similarity conditions. Finally, we conduct comprehensive evaluations of these federated learning algorithms, employing the proposed step size strategies to train deep neural network models on benchmark datasets under varying data similarity conditions. Our findings demonstrate significant improvements in convergence speed and overall performance.
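The three step size schedules named in the abstract can be illustrated with a minimal sketch. Only the fixed form $\gamma = c/\sqrt{K}$ appears in the text (Theorem 1). The diminishing form $c/\sqrt{k+1}$ and the step-decay form $\gamma_0 \alpha^{\lfloor k/T \rfloor}$ below are common textbook choices assumed here for illustration, not necessarily the expressions derived in the paper. All function and parameter names are hypothetical.

```python
import math


def fixed_step(K: int, c: float = 1.0) -> float:
    """Fixed step size gamma = c / sqrt(K), the form stated in Theorem 1.

    K is the total number of iterations, c > 0 a constant.
    """
    return c / math.sqrt(K)


def diminishing_step(k: int, c: float = 1.0) -> float:
    """ASSUMED diminishing form: gamma_k = c / sqrt(k + 1), k = 0, 1, ...

    The paper's exact diminishing schedule may differ.
    """
    return c / math.sqrt(k + 1)


def step_decay_step(k: int, gamma0: float = 1.0,
                    alpha: float = 0.5, period: int = 10) -> float:
    """ASSUMED step-decay form: gamma_k = gamma0 * alpha ** floor(k / period).

    The step size is held constant within each block of `period`
    iterations and multiplied by `alpha` at every block boundary.
    """
    return gamma0 * alpha ** (k // period)


if __name__ == "__main__":
    K = 100
    for k in (0, 9, 10, 99):
        print(k, fixed_step(K), diminishing_step(k), step_decay_step(k))
```

None of the three functions takes a data similarity parameter as input, which is the property the abstract emphasizes. The constants `c`, `gamma0`, `alpha`, and `period` are placeholders that would be set from the problem constants in the paper's theorems.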

Paper Structure

This paper contains 32 sections, 11 theorems, 142 equations, 9 figures, 1 table, 4 algorithms.

Introduction
Contributions
Notations
Prior Works
Data Similarity Assumptions
Communication-efficient Federated Optimization
Step Size Schedules for Stochastic Optimization
Main Convergence Theorems
Applications in Federated Learning
Full-precision Federated Learning Algorithms
FedAvg
FedProx
Error-feedback Federated Learning Algorithms
Error-feedback FedAvg
Error-feedback FedProx
...and 17 more sections

Key Result

Theorem 1

Consider the system eqn:mainIneq. If $\gamma_k = \gamma = c/\sqrt{K}$ for $c>0$ and $K\in\mathbb{N}$, then

Figures (9)

Figure 1: Visual workflow of our analysis.
Figure 2: Visual workflow of the full-precision federated learning algorithms.
Figure 3: Visual workflow of the error-feedback federated learning algorithms.
Figure 4: Performance of FedAvg, error-feedback FedAvg, FedProx, and error-feedback FedProx with the fixed step size, in terms of training loss (left plots) and test accuracy (right plots) on the MNIST dataset, for three different partitions of the data among the workers.
Figure 5: Performance of FedAvg, error-feedback FedAvg, FedProx, and error-feedback FedProx with the fixed step size, in terms of training loss (left plots) and test accuracy (right plots) on the FashionMNIST dataset, for three different partitions of the data among the workers.
...and 4 more figures

Theorems & Definitions (16)

Theorem 1: Fixed step sizes
Theorem 2: Diminishing step sizes
Theorem 3: Step-decay step sizes
proof
Proposition 1: FedAvg
Proposition 2: FedProx
Proposition 3: Error-feedback FedAvg
Proposition 4: Error-feedback FedProx
Lemma 1
proof
...and 6 more
