FedFa: A Fully Asynchronous Training Paradigm for Federated Learning

Haotian Xu; Zhaorui Zhang; Sheng Di; Benben Liu; Khalid Ayed Alharthi; Jiannong Cao

TL;DR

The paper tackles the wall-clock inefficiency of standard Federated Learning caused by barrier synchronization across heterogeneous devices. It introduces FedFa, a fully asynchronous training paradigm that maintains a server-side sliding window to merge multiple historical updates, thereby eliminating waiting times while preserving convergence. A convergence analysis under standard assumptions shows FedFa attains a convergence rate comparable to FedBuff, and two practical variants, FedFa-Param and FedFa-Delta, are proposed. Empirical results on CIFAR-10 and NLP tasks demonstrate substantial improvements in wall-clock time (up to sixfold) and reductions in communication rounds (up to 1.9x) with maintained accuracy in both IID and Non-IID settings, and the method remains compatible with secure aggregation.
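
The sliding-window idea described above can be illustrated with a small sketch. This is not the paper's algorithm: the class name `SlidingWindowServer`, the `receive` method, and the uniform average over the window are all assumptions made for illustration, since this summary does not state the exact merge rule. The sketch only shows the general pattern in which the server updates the global model on every client upload, with no barrier, by merging the most recent buffered results.

```python
from collections import deque
from typing import Deque, List


class SlidingWindowServer:
    """Hypothetical server that merges the K most recent client uploads."""

    def __init__(self, initial_model: List[float], window_size: int) -> None:
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        self.model = list(initial_model)
        # deque(maxlen=K) silently drops the oldest entry once K are stored.
        self.window: Deque[List[float]] = deque(maxlen=window_size)

    def receive(self, client_params: List[float]) -> List[float]:
        """Process one client upload immediately and return the new global model."""
        if len(client_params) != len(self.model):
            raise ValueError("parameter vectors must have the same length")
        self.window.append(list(client_params))
        # Assumption: a plain uniform average over the window contents.
        count = len(self.window)
        self.model = [sum(column) / count for column in zip(*self.window)]
        return list(self.model)


server = SlidingWindowServer([0.0, 0.0], window_size=2)
first = server.receive([2.0, 4.0])   # window holds one upload
second = server.receive([4.0, 8.0])  # window holds two uploads
third = server.receive([6.0, 0.0])   # oldest upload is evicted
```

Because each call to `receive` returns a fresh global model, a client in this sketch never waits for other clients; the window size plays the role of the buffer size $K$.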

Abstract

Federated learning has been identified as an efficient decentralized training paradigm for scaling machine learning model training across a large number of devices while guaranteeing the data privacy of the trainers. FedAvg has become a foundational parameter update strategy for federated learning, and it has shown promise in mitigating the effect of heterogeneous data across clients while guaranteeing convergence. However, the synchronization barrier for parameter updates in each communication round causes significant time to be spent waiting, slowing down the training procedure. Therefore, recent state-of-the-art solutions propose semi-asynchronous approaches that reduce the waiting time while still guaranteeing convergence. Nevertheless, these semi-asynchronous approaches are unable to eliminate the waiting time completely. We propose a fully asynchronous training paradigm, called FedFa, which guarantees model convergence and eliminates the waiting time completely by using a few buffered results on the server for parameter updating. Further, we provide a theoretical proof of the convergence rate of FedFa. Extensive experimental results indicate that our approach speeds up federated learning training by up to 6x and 4x compared to the state-of-the-art synchronous and semi-asynchronous strategies, respectively, while retaining high accuracy in both IID and Non-IID scenarios.

Paper Structure (16 sections, 23 equations, 6 figures, 6 tables, 2 algorithms)

Introduction
Background, Related Work and Motivations
Synchronous Federated Learning
Semi-asynchronous and Asynchronous FL
Security and Privacy Protection
FedFa: Fully Asynchronous Federated Average for Federated Learning
The Design of FedFa
Gradient OR Parameter Transmission
Convergence Analysis of FedFa
Problem Formulation
The Proof for Convergence Rate of FedFa
Performance Evaluation and Analysis
Prototype Implementation
Evaluation Methodology
Results and Analysis
...and 1 more sections

Figures (6)

Figure 1: The wall clock time comparison.
Figure 2: Data distribution where a larger circle indicates a larger dataset size for each label 0-9 of CIFAR10.
Figure 3: Performance comparison of different training strategies on both IID and Non-IID data settings regarding the wall clock time.
Figure 4: Performance comparison of different training strategies on both IID and Non-IID data settings regarding communication round.
Figure 5: The comparison of different buffer sizes $K$.
...and 1 more figures
