A Resource-Adaptive Approach for Federated Learning under Resource-Constrained Environments
Ruirui Zhang, Xingze Wu, Yifei Zou, Zhenzhen Xie, Peng Li, Xiuzhen Cheng, Dongxiao Yu
TL;DR
This work tackles federated learning among heterogeneous, resource-constrained clients by introducing Fed-RAA, a resource-adaptive asynchronous FL algorithm that assigns fragments of the global model to clients based on their compute and communication capabilities. Each client trains its assigned fragment locally, with a proximal regularization term to stabilize local optimization, and the global model is updated asynchronously. A theoretical convergence analysis shows that, under standard smoothness and weak convexity assumptions and bounded staleness, Fed-RAA converges to a critical point with explicit rate bounds; an online greedy scheduler, Gre-RAA, further guarantees near-optimal delay bounds and fairness. Empirical evaluation on MNIST, CIFAR-10, and CIFAR-100 shows that Fed-RAA achieves faster wall-clock convergence than several baselines, and ablation studies confirm the value of asynchronous updates and online fragment assignment. The results underscore the practical potential of resource-adaptive fragmentation for rapid, scalable FL in edge environments; future work will extend the approach to large-model architectures and mixture-of-experts strategies.
Abstract
This paper studies a fundamental federated learning (FL) problem involving multiple clients with heterogeneous, constrained resources. Relative to the large number of parameters to be trained, the computing and communication resources of clients are insufficient for fast local training and real-time knowledge sharing. In addition, training on clients with heterogeneous resources may cause the straggler problem. To address these issues, we propose Fed-RAA: a Resource-Adaptive Asynchronous Federated learning algorithm. Unlike vanilla FL methods, where each participating client trains all parameters regardless of resource diversity, Fed-RAA adaptively allocates fragments of the global model to clients based on their computing and communication capabilities. Each client then individually trains its assigned model fragment and asynchronously uploads the updated result. Theoretical analysis confirms the convergence of our approach. Additionally, we design an online greedy-based algorithm for fragment allocation in Fed-RAA, achieving fairness comparable to an offline strategy. We present numerical results on MNIST, CIFAR-10, and CIFAR-100, along with comparisons against baselines and ablation studies, demonstrating the advantages of our work. To the best of our knowledge, this is the first resource-adaptive asynchronous method for fragment-based FL with guaranteed theoretical convergence.
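The workflow described above (capability-based fragment assignment, local training with a proximal term, asynchronous upload) can be illustrated with a minimal Python sketch. It is not the paper's Fed-RAA or Gre-RAA implementation: the cost model in round_time, the deadline rule in pick_fragment, the staleness-discounted mixing weight in async_merge, and all names and default constants are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Client:
    # Hypothetical capability descriptors; the paper's resource model may differ.
    name: str
    flops_per_s: float   # compute throughput
    bytes_per_s: float   # uplink bandwidth


def round_time(client, n_params, flops_per_param=6.0, bytes_per_param=4.0):
    """Estimated time to train and upload a fragment of n_params parameters.

    Assumed linear cost model: compute time plus upload time.
    """
    compute = n_params * flops_per_param / client.flops_per_s
    upload = n_params * bytes_per_param / client.bytes_per_s
    return compute + upload


def pick_fragment(client, fragment_sizes, deadline):
    """Greedy rule (an assumption, not Gre-RAA): pick the largest fragment the
    client can finish within the deadline, else fall back to the smallest."""
    feasible = [i for i, n in enumerate(fragment_sizes)
                if round_time(client, n) <= deadline]
    if feasible:
        return max(feasible, key=lambda i: fragment_sizes[i])
    return min(range(len(fragment_sizes)), key=lambda i: fragment_sizes[i])


def proximal_step(w, grad, w_anchor, lr=0.1, mu=0.5):
    """One local gradient step on f(w) + (mu / 2) * ||w - w_anchor||^2,
    where w_anchor is the fragment received from the server."""
    return [wi - lr * (gi + mu * (wi - ai))
            for wi, gi, ai in zip(w, grad, w_anchor)]


def async_merge(global_frag, client_frag, staleness, alpha=0.5):
    """Mix an uploaded fragment into the global fragment as soon as it arrives.

    The weight alpha / (1 + staleness) is an assumed discount for stale updates.
    """
    a = alpha / (1 + staleness)
    return [(1 - a) * g + a * c for g, c in zip(global_frag, client_frag)]


if __name__ == "__main__":
    sizes = [1_000_000, 250_000, 500_000]
    fast = Client("fast", flops_per_s=6e6, bytes_per_s=4e6)
    slow = Client("slow", flops_per_s=6e5, bytes_per_s=4e5)
    for c in (fast, slow):
        print(c.name, "-> fragment", pick_fragment(c, sizes, deadline=1.0))
```

In this sketch a faster client receives a larger fragment and a slower client a smaller one, so no client holds up the others, and the server merges each upload immediately rather than waiting for a synchronous round.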
