Knowledge Rumination for Client Utility Evaluation in Heterogeneous Federated Learning

Xiaorui Jiang; Yu Gao; Hengwei Xu; Qi Zhang; Yong Liao; Pengyuan Zhou

TL;DR

This paper tackles asynchronous federated learning under Non-IID data and gradient staleness by proposing FedHist, a framework that uses a server-side gradient history and the notion of knowledge rumination to enable history-based, multi-dimensional client weighting and robust gradient fusion. It combines three mechanisms: Enhancement of Gradient Stability (EGS), which stabilizes local gradients; History-Aware Aggregation (HAA), which reuses informative past gradients via a gradient buffer; and Intelligent $\ell_2$-Norm Amplification (INA), which preserves gradient energy during aggregation. In extensive experiments on FMNIST, CIFAR-10, and CIFAR-100, FedHist consistently outperforms state-of-the-art baselines in convergence speed, test accuracy, and fairness, especially in highly heterogeneous and stale settings. The approach preserves privacy and requires no prior knowledge about clients, making it practical for real-world heterogeneous FL deployments.

Abstract

Federated Learning (FL) allows several clients to cooperatively train machine learning models without disclosing their raw data. In practical applications, asynchronous FL (AFL) mitigates the straggler effect that slows synchronous FL. However, Non-IID data and stale models pose significant challenges to AFL, as they can diminish the practicality of the global model and even lead to training failures. In this work, we propose a novel AFL framework called Federated Historical Learning (FedHist), which addresses the challenges posed by both Non-IID data and gradient staleness based on the concept of knowledge rumination. FedHist enhances the stability of local gradients by performing weighted fusion with historical global gradients cached on the server. Relying on hindsight, it assigns aggregation weights to each participant in a multi-dimensional manner during each communication round. To further enhance the efficiency and stability of the training process, we introduce an intelligent $\ell_2$-norm amplification scheme, which dynamically regulates the learning progress based on the $\ell_2$-norms of the submitted gradients. Extensive experiments indicate that FedHist outperforms state-of-the-art methods in terms of convergence performance and test accuracy.
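
The abstract names two server-side operations without giving their formulas: weighted fusion of a local gradient with cached historical global gradients, and amplification based on the $\ell_2$-norms of the submitted gradients. The following is a minimal sketch of what such steps could look like, not FedHist's actual update rules. The mixing weight `alpha`, the use of the plain mean of the history, the uniform aggregation weights, and the choice to rescale the aggregate to the mean submitted norm are all assumptions made for illustration, as are the function names.

```python
import math
from typing import List, Sequence

Vector = List[float]


def l2_norm(v: Sequence[float]) -> float:
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(x * x for x in v))


def fuse_with_history(local_grad: Sequence[float],
                      history: Sequence[Sequence[float]],
                      alpha: float = 0.5) -> Vector:
    """Blend a local gradient with the mean of cached global gradients.

    `alpha` is a hypothetical mixing weight; the paper's weighting is not
    specified in the abstract.
    """
    if not history:
        return list(local_grad)
    mean_hist = [sum(col) / len(history) for col in zip(*history)]
    return [alpha * g + (1.0 - alpha) * h for g, h in zip(local_grad, mean_hist)]


def aggregate(grads: Sequence[Sequence[float]],
              weights: Sequence[float]) -> Vector:
    """Weighted average of gradients (weights are normalized here)."""
    total = sum(weights)
    dim = len(grads[0])
    return [sum(w * g[i] for w, g in zip(weights, grads)) / total
            for i in range(dim)]


def amplify_to_mean_norm(agg: Sequence[float],
                         grads: Sequence[Sequence[float]]) -> Vector:
    """Rescale the aggregate so its l2-norm equals the mean submitted norm.

    This particular target norm is an assumption, chosen only to show how
    averaging can shrink a gradient and how rescaling can undo that.
    """
    target = sum(l2_norm(g) for g in grads) / len(grads)
    norm = l2_norm(agg)
    if norm == 0.0:
        return list(agg)
    return [x * target / norm for x in agg]


# Toy round: two clients, a two-entry history of past global gradients.
history = [[1.0, 0.0], [1.0, 0.0]]
local_grads = [[0.0, 2.0], [2.0, 0.0]]

stabilized = [fuse_with_history(g, history, alpha=0.5) for g in local_grads]
averaged = aggregate(stabilized, weights=[1.0, 1.0])
update = amplify_to_mean_norm(averaged, stabilized)
target_norm = sum(l2_norm(g) for g in stabilized) / len(stabilized)
```

In this toy round the plain average has a smaller norm than either stabilized gradient because the two clients point in different directions; the rescaling step restores the magnitude while keeping the averaged direction.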

Paper Structure (20 sections, 14 equations, 5 figures, 2 tables)

Introduction
Methodology
Problem Statement
Enhancement of Gradient Stability (EGS)
History-Aware Aggregation (HAA)
Relatively fresh updates
Predicted unbiased gradients
Client utility evaluation
Intelligent $\ell_2$-Norm Amplification (INA)
Experiments
Experimental Settings
Performance Comparison
Data heterogeneity
Degrees of staleness
Convergence speed
...and 5 more sections

Figures (5)

Figure 1: Problem illustration of K-async FL. (a) Non-IID data introduces biases among local gradients. This heterogeneity affects the convergence of the global model. (b) Stale gradients are often detrimental to the global model, causing update directions to deviate from local optima.
Figure 2: The overall framework of FedHist. It comprises three components: EGS, HAA, and INA.
Figure 3: Performance of FedHist and baselines during the training process on CIFAR-10 with $\beta$ = 0.3, 1.0 and $\infty$.
Figure 4: Performance with different missing components on CIFAR-10.
Figure 5: Performance with specific labels occurring only on a few slow clients.
