Scheduling and Aggregation Design for Asynchronous Federated Learning over Wireless Networks

Chung-Hsuan Hu; Zheng Chen; Erik G. Larsson

TL;DR

This work addresses the straggler and communication-bottleneck problems in federated learning (FL) over wireless networks by introducing an asynchronous FL framework with periodic aggregation. It combines channel-aware data-importance-based scheduling with age-aware aggregation to reduce the bias and variance of the aggregated updates. A theoretical convergence analysis and MNIST-based simulations support the design, showing improved convergence over synchronous FedAvg and fully asynchronous FedAsync, especially under non-i.i.d. data. The results offer practical guidelines for wireless FL, covering resource allocation, compression strategies, and age-aware weighting of updates, with the aim of faster and more reliable learning across heterogeneous devices. Overall, the design indicates that balancing data representativeness, channel quality, and update freshness yields robust performance in resource-constrained FL systems.

Abstract

Federated Learning (FL) is a collaborative machine learning (ML) framework that combines on-device training and server-based aggregation to train a common ML model among distributed agents. In this work, we propose an asynchronous FL design with periodic aggregation to tackle the straggler issue in FL systems. Considering limited wireless communication resources, we investigate the effect of different scheduling policies and aggregation designs on the convergence performance. Driven by the importance of reducing the bias and variance of the aggregated model updates, we propose a scheduling policy that jointly considers the channel quality and training data representation of user devices. The effectiveness of our channel-aware data-importance-based scheduling policy, compared with state-of-the-art methods proposed for synchronous FL, is validated through simulations. Moreover, we show that an "age-aware" aggregation weighting design can significantly improve the learning performance in an asynchronous FL setting.
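To make the idea of "age-aware" aggregation weighting concrete, the following is a minimal Python sketch, not the paper's actual rule. It assumes each received update carries an integer age, that updates older than a limit `a_lim` are ignored, and that each age group receives a weight proportional to `decay ** age`, normalized over the nonempty groups. The function name `age_aware_aggregate`, the `decay` parameter, and the geometric weighting are all hypothetical choices made for illustration.

```python
def age_aware_aggregate(theta, updates, ages, a_lim, decay=0.5):
    """Apply age-weighted model updates to the global parameters.

    theta   : list of floats, current global model parameters
    updates : list of update vectors (each a list of floats)
    ages    : list of ints, age of each update in global iterations
    a_lim   : largest age that is still aggregated (assumption)
    decay   : hypothetical geometric discount per unit of age
    """
    # Group the updates by age, dropping anything older than a_lim.
    groups = {}
    for update, age in zip(updates, ages):
        if 0 <= age <= a_lim:
            groups.setdefault(age, []).append(update)
    if not groups:
        return list(theta)

    # Normalize the group weights so that they sum to one
    # over the nonempty age groups only.
    raw = {age: decay ** age for age in groups}
    total = sum(raw.values())

    new_theta = list(theta)
    for age, members in groups.items():
        weight = raw[age] / total
        for i in range(len(theta)):
            group_mean = sum(u[i] for u in members) / len(members)
            new_theta[i] += weight * group_mean
    return new_theta


# Illustrative call: one fresh update, one stale update, one expired update.
result = age_aware_aggregate(
    theta=[0.0, 0.0],
    updates=[[1.0, 1.0], [3.0, 3.0], [10.0, 10.0]],
    ages=[0, 1, 5],
    a_lim=3,
)
```

In this sketch, fresher updates dominate the aggregate while stale ones still contribute, and normalizing over nonempty groups keeps the total weight at one regardless of which ages are present in a given round.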

Paper Structure (27 sections, 4 theorems, 67 equations, 10 figures)

Introduction
System Model
FedAvg with Synchronous Training and Aggregation
Asynchronous FL with Periodic Aggregation
Physical Layer (PHY) Model
Communication resource allocation
Sparsification and Quantization
Motivation of Scheduling and Aggregation Design
Training data distribution
Data compression
Asynchronous model updates
Scheduling and Aggregation Design for Asynchronous FL
Channel-aware Data-importance-based Scheduling
Age-aware Model Aggregation
Convergence Analysis
...and 12 more sections

Key Result

Theorem 1

Under the assumptions labeled asump:lSmooth through assump:rmin, $E=1$, constant learning rate $\alpha(t)\triangleq\alpha<\frac{\mu r_{\min}}{d\left(2L^2+C_3\right)}$, and partial device participation such that $\Pi(\rho)=\cup_{j=0,\ldots,a_{\lim}}\mathcal{M}_j(\rho)\subseteq\mathcal{N}$ for $\rho=1,\ldots,t$, it holds that [...], where $C_3=8L^2\left[\left(1+\frac{d}{4\nu^2}\right)C_2+1\right]$ and [...]. The expectation is taken over the [...]
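As a worked example of the learning-rate condition in Theorem 1, the sketch below evaluates $C_3$ and the resulting upper bound on $\alpha$ exactly as the two formulas are written above. The constants $\mu$, $r_{\min}$, $L$, $d$, $\nu$, and $C_2$ are not defined in this excerpt, so the numeric values used here are purely hypothetical placeholders, and the function names `c3` and `learning_rate_bound` are illustrative.

```python
def c3(L, d, nu, C2):
    """C_3 = 8 L^2 [ (1 + d / (4 nu^2)) C_2 + 1 ], as stated in Theorem 1."""
    return 8.0 * L ** 2 * ((1.0 + d / (4.0 * nu ** 2)) * C2 + 1.0)


def learning_rate_bound(mu, r_min, L, d, nu, C2):
    """Strict upper bound on alpha: mu * r_min / (d * (2 L^2 + C_3))."""
    return mu * r_min / (d * (2.0 * L ** 2 + c3(L, d, nu, C2)))


# Hypothetical constants, chosen only to make the arithmetic easy to follow.
C3_value = c3(L=1.0, d=4, nu=1.0, C2=1.0)
bound = learning_rate_bound(mu=1.0, r_min=0.5, L=1.0, d=4, nu=1.0, C2=1.0)
```

With these placeholder values, $C_3 = 8\,[(1+1)\cdot 1 + 1] = 24$, so the condition reads $\alpha < 0.5 / (4 \cdot 26) = 0.5/104$. Any constant learning rate strictly below that value would satisfy the stated condition for this choice of constants.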

Figures (10)

Figure 1: The FL process and information exchange between the server and the participating devices.
Figure 2: Conceptual difference between synchronous FL, fully asynchronous FL, and our proposed asynchronous FL with periodic aggregation. $\boldsymbol{\theta}(t)$ represents the model parameter vector in the $t$-th global iteration.
Figure 3: The relations between $\mathcal{N}$, $\mathcal{K}(t)$, $\Pi(t)$, and $\{\mathcal{M}_m(t)\}_0^{a_{\lim}}, a_{\lim}=3,$ at iteration $t$, where each colored block represents one device, and same-color devices have the same ALU. An example of normalization scaling for each nonempty $\mathcal{M}_m(t)$ is provided.
Figure 4: Test accuracy of the proposed scheme under different partial scheduling ratios, where $N=40$ and $n=50000$.
Figure 5: Impact of $\tilde{T}$ on test accuracy of the proposed asynchronous FL with periodic aggregation in i.i.d. and non-i.i.d. scenarios, where $N=40$, $R=0.2N$ and $n=300000$.
...and 5 more figures

Theorems & Definitions (9)

Definition 1
Definition 2
Theorem 1
Remark 1
Remark 2
Remark 3
Lemma 1
Lemma 2
Lemma 3
