Decentralized Federated Learning with Asynchronous Parameter Sharing for Large-scale IoT Networks
Haihui Xie, Minghua Xia, Peiran Wu, Shuai Wang, Kaibin Huang
TL;DR
This work addresses scalable, privacy‑preserving learning in resource‑constrained IoT networks by introducing a decentralized federated learning framework with asynchronous parameter sharing. It leverages a multi‑branch loss structure and the Shapley‑Folkman lemma to render the optimization effectively convex as the node count grows, enabling convergence to a global stationary point with higher probability than centralized FL. A gradient‑descent algorithm with asynchronous sharing is analyzed under a derived convergence bound that incorporates transmission delays, and a wireless resource allocation scheme jointly optimizes node scheduling and bandwidth to minimize transmission delay. Simulations on MNIST and CIFAR demonstrate faster convergence and improved accuracy over benchmark schemes, validating the practical benefits of the proposed approach for large‑scale IoT deployments. The results show that carefully balancing scheduling (via SINR thresholds) and bandwidth allocation can substantially reduce training time while maintaining high learning performance in distributed, resource‑limited networks.
Abstract
Federated learning (FL) enables wireless terminals to collaboratively learn a shared parameter model while keeping all training data on the devices themselves. Parameters can be shared synchronously or asynchronously: the former transmits parameters as blocks or frames and waits until all transmissions finish, whereas the latter provides messages about the status of pending and failed parameter transmission requests. Whether synchronous or asynchronous parameter sharing is applied, the learning model must be adapted to the network architecture, because an improper learning model degrades learning performance and, even worse, can lead to model divergence under asynchronous transmission in resource-limited large-scale Internet-of-Things (IoT) networks. This paper proposes a decentralized learning model and develops an asynchronous parameter-sharing algorithm for resource-limited distributed IoT networks. The decentralized learning model approaches a convex function as the number of nodes increases, and its learning process converges to a global stationary point with a higher probability than the centralized FL model. Moreover, by jointly accounting for the convergence bound of federated learning and the transmission delay of wireless communications, we develop a node scheduling and bandwidth allocation algorithm to minimize the transmission delay. Extensive simulation results corroborate the effectiveness of the distributed algorithm in terms of fast learning model convergence and low transmission delay.
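
To make the idea of asynchronous parameter sharing concrete, the following is a minimal toy sketch and not the algorithm proposed in the paper. It assumes a simple quadratic loss at each node, a ring topology, and fixed integer delays; the names `async_decentralized_gd`, `targets`, `neighbors`, `delays`, `lr`, and `mix` are all hypothetical and chosen only for illustration. Each node takes a gradient step on its local loss while averaging with possibly stale copies of its neighbors' parameters, without waiting for fresh transmissions.

```python
import numpy as np


def async_decentralized_gd(targets, neighbors, delays, steps=600, lr=0.1, mix=0.5):
    """Toy asynchronous decentralized gradient descent (illustrative assumption).

    Node i holds the local loss 0.5 * ||w - targets[i]||^2. At every step it
    mixes its own parameters with stale neighbor parameters and then takes a
    local gradient step. No node waits for up-to-date neighbor values.
    """
    targets = np.asarray(targets, dtype=float)
    n, d = targets.shape
    # history[t][i] holds the parameters of node i at step t.
    history = [np.zeros((n, d))]
    for t in range(steps):
        current = history[-1]
        updated = np.empty_like(current)
        for i in range(n):
            # Parameters from neighbor j arrive delays[i][j] steps late;
            # before enough steps have passed, the initial value is used.
            stale = [history[max(0, t - delays[i][j])][j] for j in neighbors[i]]
            consensus = np.mean(stale, axis=0)
            mixed = (1.0 - mix) * current[i] + mix * consensus
            grad = current[i] - targets[i]  # gradient of the local quadratic loss
            updated[i] = mixed - lr * grad
        history.append(updated)
    return history[-1]


# Four nodes on a ring, each with its own scalar target and per-link delays.
targets = np.array([[0.0], [1.0], [2.0], [3.0]])
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
delays = [[0, 1, 0, 2], [2, 0, 1, 0], [0, 1, 0, 2], [1, 0, 2, 0]]
final = async_decentralized_gd(targets, neighbors, delays)
print(final.ravel())
```

In this toy setting the delays are bounded and the step size is small, so the iterates settle despite the stale information. The wireless aspects discussed in the abstract (node scheduling and bandwidth allocation) are not modeled here; the `delays` table simply stands in for whatever transmission delay the network imposes.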
