SEAFL: Enhancing Efficiency in Semi-Asynchronous Federated Learning through Adaptive Aggregation and Selective Training

Md Sirajul Islam, Sanjeev Panta, Fei Xu, Xu Yuan, Li Chen, Nian-Feng Tzeng

TL;DR

SEAFL addresses the straggler and stale-model problems in semi-asynchronous federated learning with an adaptive, staleness-aware aggregation that weights each local update by its freshness and its similarity to the current global model. An extended variant, SEAFL$^2$, lets slow devices train partially, which cuts waiting time while maintaining convergence guarantees. The method is supported by a formal convergence analysis (Theorem 1) and by experiments on EMNIST, CIFAR-10, and CINIC-10 that show up to ~22% less wall-clock time to reach target accuracy than the closest counterpart. By balancing participation against update quality, the approach converges faster and more reliably in heterogeneous, cross-device FL, which matters for real-world deployments where communication and computation are uneven across devices.

Abstract

Federated Learning (FL) is a promising distributed machine learning framework that allows collaborative learning of a global model across decentralized devices without uploading their local data. However, in real-world FL scenarios, the conventional synchronous FL mechanism suffers from inefficient training caused by slow devices, commonly known as stragglers, especially in heterogeneous communication environments. Although asynchronous FL effectively tackles the efficiency challenge, it induces substantial system overhead and model degradation. Striking a balance between the two, semi-asynchronous FL has gained increasing attention, but it still suffers from the open challenge of stale models: newly arrived updates are computed from outdated weights, which can easily hurt the convergence of the global model. In this paper, we present SEAFL, a novel FL framework designed to mitigate both the straggler and the stale-model challenges in semi-asynchronous FL. SEAFL dynamically assigns weights to uploaded models during aggregation based on their staleness and their importance to the current global model. We theoretically analyze the convergence rate of SEAFL and further enhance training efficiency with an extended variant that allows partial training on slower devices, enabling them to contribute to global aggregation while reducing excessive waiting times. We evaluate the effectiveness of SEAFL through extensive experiments on three benchmark datasets. The experimental results demonstrate that SEAFL outperforms its closest counterpart by up to ~22% in terms of the wall-clock training time required to achieve target accuracy.
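
The aggregation idea in the abstract, weighting each uploaded model by its staleness and by its importance to the current global model, can be illustrated with a minimal sketch. This is not the paper's exact rule: the polynomial staleness discount, the use of cosine similarity as the importance score, and the parameter names `decay` and `server_mix` are all assumptions chosen for illustration, and they are not meant to correspond to the paper's $\alpha$ and $\mu$.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two flat parameter vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def aggregation_weights(
    global_model: Sequence[float],
    local_models: Sequence[Sequence[float]],
    staleness: Sequence[int],
    decay: float = 0.5,  # hypothetical hyperparameter, not taken from the paper
) -> List[float]:
    """Return normalized weights that favor fresh updates similar to the global model."""
    raw = []
    for model, tau in zip(local_models, staleness):
        # Assumed freshness term: polynomial discount in the staleness tau.
        freshness = (1.0 + tau) ** (-decay)
        # Assumed importance term: cosine similarity rescaled from [-1, 1] to [0, 1].
        importance = (1.0 + cosine_similarity(model, global_model)) / 2.0
        raw.append(freshness * importance)
    total = sum(raw)
    if total == 0.0:
        # Fall back to uniform weights if every raw score vanished.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def aggregate(
    global_model: Sequence[float],
    local_models: Sequence[Sequence[float]],
    staleness: Sequence[int],
    decay: float = 0.5,
    server_mix: float = 1.0,  # hypothetical: 1.0 replaces the global model outright
) -> List[float]:
    """Blend the weighted average of buffered local models into the global model."""
    weights = aggregation_weights(global_model, local_models, staleness, decay)
    merged = [
        sum(w * model[i] for w, model in zip(weights, local_models))
        for i in range(len(global_model))
    ]
    return [(1.0 - server_mix) * g + server_mix * m for g, m in zip(global_model, merged)]


if __name__ == "__main__":
    current = [1.0, 0.0]
    buffered = [[0.9, 0.1], [0.2, 0.8]]  # second update points away from the global model
    print(aggregation_weights(current, buffered, staleness=[0, 3]))
    print(aggregate(current, buffered, staleness=[0, 3]))
```

In this sketch, an update that is both stale and dissimilar to the global model receives the smallest weight, so it still contributes to the aggregate but cannot dominate it.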

Paper Structure

This paper contains 14 sections, 12 equations, 6 figures, 1 table, 2 algorithms.

Figures (6)

  • Figure 1: The working process of synchronous, asynchronous, and semi-asynchronous FL algorithms.
  • Figure 2: Illustration of the impact of buffer size, staleness limit, and importance of local updates on asynchronous FL, where $\gamma_t$ denotes the staleness factor and $s_t$ denotes the importance of an update.
  • Figure 3: Left: The traditional AsyncFL architecture where the server initiates aggregation upon receiving the required number of local updates. Right: The proposed SEAFL$^2$ allows partial training on slower devices, enabling them to contribute to global aggregation. The server will notify slower devices to send their updates immediately after exceeding the staleness limit.
  • Figure 4: Elapsed wall-clock time required to reach target accuracy for different combinations of $\alpha$ and $\mu$.
  • Figure 5: Elapsed wall-clock time required to reach target accuracy for SEAFL (without partial training), FedBuff, FedAsync, and FedAvg. SEAFL converges faster to reach target accuracy and consistently outperforms other baselines.
  • ...and 1 more figure