A Joint Approach to Local Updating and Gradient Compression for Efficient Asynchronous Federated Learning
Jiajun Song, Jiajun Luo, Rongwei Lu, Shuzhao Xie, Bin Chen, Zhi Wang
TL;DR
This work tackles Asynchronous Federated Learning (AFL) under device heterogeneity and limited bandwidth, addressing model staleness by jointly optimizing the local updating frequency and the gradient compression rate. The authors derive a convergence upper bound that identifies a key factor $\phi$ depending on each device's local updating frequency $k_i$ and gradient compression rate $\delta_i$, and they propose FedLuck, which adapts these parameters per device by minimizing $\phi$ using the locally measured times $\alpha_i$ and $\beta_i$. Empirical evaluation on image and speech tasks shows that FedLuck reduces communication by $56\%$ and training time by $\approx 55\%$ on average while maintaining competitive accuracy, even in Non-IID settings. By folding the computation-communication trade-off into a single adaptive framework, the approach makes AFL more efficient to deploy on heterogeneous devices and networks.
Abstract
Asynchronous Federated Learning (AFL) confronts inherent challenges arising from the heterogeneity of devices (e.g., their computation capacities) and low-bandwidth environments, both of which can cause stale model updates (e.g., local gradients) at global aggregation. Traditional approaches to mitigating update staleness typically adjust either the local updating frequency or the gradient compression rate, but not both. Recognizing this gap, we introduce a novel approach that combines local updating with gradient compression. Our research begins by examining the interplay between local updating frequency and gradient compression rate, and their collective impact on convergence speed. The theoretical upper bound shows that the local updating frequency and gradient compression rate of each device are jointly determined by its computing power, communication capabilities, and other factors. Building on this foundation, we propose an AFL framework called FedLuck that adaptively optimizes both the local updating frequency and the gradient compression rate. Experiments on image classification and speech recognition show that FedLuck reduces communication consumption by 56% and training time by 55% on average, achieving competitive performance in heterogeneous and low-bandwidth scenarios compared to the baselines.
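To make the two per-device knobs concrete, the following is a minimal sketch of one device round: run $k$ local SGD steps, then compress the accumulated update at rate $\delta$ before upload. It is an illustration under stated assumptions, not FedLuck itself. The abstract does not name the compressor, so top-k sparsification is swapped in as a common choice. The function names (`top_k_sparsify`, `local_round`, `round_time`), the toy quadratic objective, and the linear cost model (reading $\alpha$ as the time per local step and $\beta$ as the time to send an uncompressed update) are all hypothetical, and the paper's factor $\phi$ is not reproduced here.

```python
import math


def top_k_sparsify(update, delta):
    """Keep the ceil(delta * n) largest-magnitude entries and zero the rest.

    Top-k sparsification is an assumed compressor, chosen for illustration.
    """
    n = len(update)
    keep = max(1, math.ceil(delta * n))
    order = sorted(range(n), key=lambda j: abs(update[j]), reverse=True)
    kept = set(order[:keep])
    return [update[j] if j in kept else 0.0 for j in range(n)]


def local_round(weights, grad_fn, k, delta, lr=0.1):
    """Run k local SGD steps, then return the compressed accumulated update."""
    w = list(weights)
    for _ in range(k):
        g = grad_fn(w)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    update = [wj - w0 for wj, w0 in zip(w, weights)]
    return top_k_sparsify(update, delta)


def round_time(k, delta, alpha, beta):
    """Assumed cost model: alpha per local step plus delta * beta to upload."""
    return k * alpha + delta * beta


# Toy objective (hypothetical): f(w) = 0.5 * ||w - target||^2, gradient w - target.
target = [1.0, -2.0, 0.5, 0.0]


def grad_fn(w):
    return [wj - tj for wj, tj in zip(w, target)]


sparse_update = local_round([0.0] * 4, grad_fn, k=3, delta=0.5)
elapsed = round_time(k=3, delta=0.5, alpha=2.0, beta=10.0)
```

Under this assumed cost model, a larger $k$ or $\delta$ lengthens the round and so makes the device's update staler on arrival, while a smaller $k$ or $\delta$ sends less information per round. This is the kind of trade-off the joint optimization is described as balancing.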
