MimiC: Combating Client Dropouts in Federated Learning by Mimicking Central Updates
Yuchang Sun, Yuyi Mao, Jun Zhang
TL;DR
This work addresses the convergence problems that arbitrary client dropouts cause in cross-device federated learning (FL). It first shows that, even with a decaying learning rate, FedAvg can fail to converge because the aggregated update is biased away from the central gradient. It then introduces MimiC, a server-side correction that mimics the central update using history-based correction terms, so that the divergence stays bounded and training converges to a stationary point under proper learning-rate schedules. The theoretical results cover both deterministic and probabilistic dropout, and experiments on FMNIST and CIFAR-10 show MimiC achieving higher accuracy and more stable training than FedAvg, FedProx, SCAFFOLD, and MIFA. The approach is practical (no extra client computation) and privacy-friendly, which makes it well suited to reliable FL in mobile edge environments.
Abstract
Federated learning (FL) is a promising framework for privacy-preserving collaborative learning, where model training tasks are distributed to clients and only the model updates need to be collected at a server. However, when deployed in mobile edge networks, clients may have unpredictable availability and drop out of the training process, which hinders the convergence of FL. This paper tackles this critical challenge. Specifically, we first investigate the convergence of the classical FedAvg algorithm with arbitrary client dropouts. We find that with the common choice of a decaying learning rate, FedAvg oscillates around a stationary point of the global loss function, which is caused by the divergence between the aggregated update and the desired central update. Motivated by this new observation, we then design a novel training algorithm named MimiC, where the server modifies each received model update based on the previous ones. The modified updates mimic the imaginary central update irrespective of which clients drop out. The theoretical analysis of MimiC shows that the divergence between the aggregated and central updates diminishes with proper learning rates, leading to its convergence. Simulation results further demonstrate that MimiC maintains stable convergence performance and learns better models than the baseline methods.
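The gap between the aggregated and the central update can be made concrete with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the quadratic client losses, the `targets` values, and the `HistoryCorrectedServer` class with its correction rule. The rule shifts each fresh client gradient by the gap recorded in that client's last active round between the aggregate and its own gradient. The paper's exact rule may differ.

```python
import numpy as np

# Toy setup (assumed for illustration): client i holds the quadratic loss
# f_i(w) = 0.5 * ||w - a_i||^2, so its gradient at w is simply w - a_i.
targets = {
    0: np.array([0.0, 0.0]),
    1: np.array([4.0, 0.0]),
    2: np.array([2.0, 6.0]),
}


def client_grad(i, w):
    """Gradient of client i's local loss at model w."""
    return w - targets[i]


def central_grad(w):
    """Gradient of the global loss: the average over ALL clients."""
    return np.mean([client_grad(i, w) for i in targets], axis=0)


def fedavg_grad(w, active):
    """Plain average over the clients that actually reported this round."""
    return np.mean([client_grad(i, w) for i in active], axis=0)


class HistoryCorrectedServer:
    """Hypothetical history-based correction, not the paper's exact rule."""

    def __init__(self):
        self.last_grad = {}  # client id -> raw gradient from its last active round
        self.last_agg = {}   # client id -> aggregated update of that same round

    def aggregate(self, w, active):
        raw, corrected = {}, []
        for i in active:
            raw[i] = client_grad(i, w)
            if i in self.last_grad:
                # Shift the fresh gradient by the gap seen last time between
                # the aggregate and this client's own gradient.
                corrected.append(raw[i] + self.last_agg[i] - self.last_grad[i])
            else:
                corrected.append(raw[i])
        agg = np.mean(corrected, axis=0)
        for i in active:
            self.last_grad[i] = raw[i]
            self.last_agg[i] = agg
        return agg


w = np.array([10.0, -3.0])
server = HistoryCorrectedServer()
server.aggregate(w, active=[0, 1, 2])  # warm-up round with every client present

w_next = w - 0.5 * central_grad(w)     # the model moves on
active = [0]                           # clients 1 and 2 drop out

bias_fedavg = fedavg_grad(w_next, active) - central_grad(w_next)
bias_corrected = server.aggregate(w_next, active) - central_grad(w_next)
print("FedAvg bias:   ", bias_fedavg)
print("Corrected bias:", bias_corrected)
```

In this toy the corrected bias vanishes exactly, but only because all clients share the same quadratic curvature. For general losses a history-based correction can only shrink the gap, which is consistent with the abstract's statement that the divergence diminishes with proper learning rates.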
