Accelerating Hybrid Federated Learning Convergence under Partial Participation
Jieming Bian, Lei Wang, Kun Yang, Cong Shen, Jie Xu
TL;DR
This work addresses convergence in hybrid federated learning, where the server holds a small, representative dataset and clients participate partially with non-IID data. It first analyzes CLG-SGD under non-IID data and partial participation, showing that server training helps but that partial participation remains a bottleneck. The authors then introduce FedCLG, with two variants that use server-side gradients to correct variance: FedCLG-C applies the correction during client updates, and FedCLG-S applies it during server aggregation. Theoretical convergence results show accelerated rates when server data is leveraged, and experiments on MNIST, CIFAR-10, and CIFAR-100 show that FedCLG outperforms state-of-the-art baselines under realistic non-IID, partial-participation settings. Overall, FedCLG provides a practical and theoretically grounded approach to speeding up hybrid FL by exploiting server data and variance-correction mechanisms.
Abstract
Over the past few years, Federated Learning (FL) has become a popular distributed machine learning paradigm. FL involves a group of clients with decentralized data who collaborate to learn a common model under the coordination of a centralized server, with the goal of protecting clients' privacy by ensuring that local datasets never leave the clients and that the server only performs model aggregation. In realistic scenarios, however, the server may be able to collect a small amount of data that approximately mimics the population distribution, and it typically has more computational power with which to train on that data. To exploit this, we focus on the hybrid FL framework in this paper. While previous hybrid FL work has shown that alternating training between clients and server can increase convergence speed, it has focused on the scenario where clients fully participate and has ignored the negative effect of partial participation. In this paper, we provide a theoretical analysis of hybrid FL under partial client participation and show that partial participation is the key constraint on convergence speed. We then propose a new algorithm called FedCLG, which exploits the two-fold role of the server in hybrid FL. First, the server performs training steps on its own small dataset. Second, the gradient computed by the server guides both the training of the participating clients and the server's aggregation. We validate our theoretical findings through numerical experiments, which show that our proposed method FedCLG outperforms state-of-the-art methods.
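To make the setting concrete, the following is a minimal toy sketch of one hybrid FL round under partial participation: a few sampled clients train locally, the server averages their updates, and the server then trains on its own small dataset. The optional correction term is an assumed SCAFFOLD-style adjustment that uses the server gradient during client updates; it only illustrates the idea of server-guided correction and is not the update rule of FedCLG-C or FedCLG-S, which this abstract does not specify. The quadratic objectives, the function names (`make_problem`, `hybrid_round`, `run`), and all hyperparameters are hypothetical.

```python
import numpy as np

def make_problem(num_clients=20, dim=5, seed=0):
    """Toy non-IID setup: client i minimizes 0.5 * ||x - c_i||^2."""
    rng = np.random.default_rng(seed)
    centers = 3.0 * rng.normal(size=(num_clients, dim))  # heterogeneous optima
    # Assumption: the server's small dataset gives a slightly noisy
    # estimate of the population optimum (the mean of the client optima).
    server_center = centers.mean(axis=0) + 0.05 * rng.normal(size=dim)
    return centers, server_center

def grad(x, center):
    """Gradient of 0.5 * ||x - center||^2."""
    return x - center

def hybrid_round(x, centers, server_center, rng, sample_size=2,
                 local_steps=5, server_steps=5, lr=0.1, correct=False):
    """Sampled clients train, the server averages, then the server trains."""
    sampled = rng.choice(len(centers), size=sample_size, replace=False)
    g_server = grad(x, server_center)  # server gradient at the global model
    updates = []
    for i in sampled:
        y = x.copy()
        g_anchor = grad(x, centers[i])  # client gradient at the global model
        for _ in range(local_steps):
            g = grad(y, centers[i])
            if correct:
                # Assumed SCAFFOLD-style correction (illustrative only):
                # swap the client's own gradient direction at the global
                # model for the server's.
                g = g - g_anchor + g_server
            y -= lr * g
        updates.append(y - x)
    x = x + np.mean(updates, axis=0)  # aggregate over participating clients
    for _ in range(server_steps):     # server-side training on its own data
        x = x - lr * grad(x, server_center)
    return x

def run(correct, rounds=50, seed=1):
    """Mean distance to the population optimum over the last 20 rounds."""
    centers, server_center = make_problem()
    rng = np.random.default_rng(seed)
    x = np.zeros(centers.shape[1])
    target = centers.mean(axis=0)
    errors = []
    for _ in range(rounds):
        x = hybrid_round(x, centers, server_center, rng, correct=correct)
        errors.append(np.linalg.norm(x - target))
    return float(np.mean(errors[-20:]))

if __name__ == "__main__":
    print("uncorrected error:", run(correct=False))
    print("corrected error:  ", run(correct=True))
```

In this toy problem all clients share the same curvature, so the correction cancels client heterogeneity exactly and the iterates no longer depend on which clients were sampled. Real models would only see a partial reduction, and the residual error would depend on how well the server data represents the population.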
