Knowledge Rumination for Client Utility Evaluation in Heterogeneous Federated Learning
Xiaorui Jiang, Yu Gao, Hengwei Xu, Qi Zhang, Yong Liao, Pengyuan Zhou
TL;DR
This paper tackles asynchronous federated learning under Non-IID data and gradient staleness by proposing FedHist, a framework that leverages a server-side gradient history and the notion of knowledge rumination to enable history-based, multi-dimensional client weighting and robust gradient fusion. It combines three mechanisms: Enhancement of Gradient Stability (EGS), History-Aware Aggregation (HAA), and Intelligent ℓ2-Norm Amplification (INA). Together they stabilize local gradients, reuse informative past gradients via a gradient buffer, and preserve gradient energy during aggregation. In extensive experiments on FMNIST, CIFAR-10, and CIFAR-100, FedHist consistently outperforms state-of-the-art baselines in convergence speed, test accuracy, and fairness, especially in highly heterogeneous and stale settings. The approach preserves privacy and does not require any prior knowledge about clients, making it practical for real-world heterogeneous FL deployments.
Abstract
Federated Learning (FL) allows multiple clients to cooperatively train machine learning models without disclosing their raw data. In practical applications, asynchronous FL (AFL) mitigates the straggler effect that affects synchronous FL. However, Non-IID data and stale models pose significant challenges to AFL, as they can diminish the practicality of the global model and even lead to training failures. In this work, we propose a novel AFL framework called Federated Historical Learning (FedHist), which addresses the challenges posed by both Non-IID data and gradient staleness based on the concept of knowledge rumination. FedHist enhances the stability of local gradients by performing weighted fusion with historical global gradients cached on the server. Relying on hindsight, it assigns aggregation weights to each participant in a multi-dimensional manner during each communication round. To further enhance the efficiency and stability of the training process, we introduce an intelligent $\ell_2$-norm amplification scheme, which dynamically regulates the learning progress based on the $\ell_2$-norms of the submitted gradients. Extensive experiments indicate that FedHist outperforms state-of-the-art methods in terms of convergence performance and test accuracy.
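To make two of the ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that the weighted fusion is a simple convex combination of a local gradient with the mean of the cached global gradients, and that norm amplification rescales the aggregated gradient to the mean $\ell_2$-norm of the submitted gradients. The function names, the mixing weight `alpha`, and both update rules are hypothetical; the abstract does not specify the exact formulas.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a gradient stored as a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def fuse_with_history(local_grad, history, alpha=0.5):
    """Blend a local gradient with the mean of cached global gradients.

    `alpha` is a hypothetical mixing weight (assumption, not from the paper).
    """
    if not history:
        return list(local_grad)
    # Coordinate-wise mean over the cached historical gradients.
    mean_hist = [sum(col) / len(history) for col in zip(*history)]
    return [alpha * g + (1.0 - alpha) * h for g, h in zip(local_grad, mean_hist)]


def amplify_to_mean_norm(aggregated, submitted):
    """Rescale the aggregate so its l2 norm equals the mean norm of the
    submitted gradients (an assumed rule standing in for the paper's scheme).
    """
    target = sum(l2_norm(g) for g in submitted) / len(submitted)
    current = l2_norm(aggregated)
    if current == 0.0:
        return list(aggregated)
    return [x * target / current for x in aggregated]
```

Under these assumptions, averaging gradients that point in different directions shrinks the norm of the aggregate, and the rescaling step restores it to the typical magnitude of the individual submissions.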
