Tackling the Non-IID Issue in Heterogeneous Federated Learning by Gradient Harmonization
Xinyu Zhang, Weiyu Sun, Ying Chen
TL;DR
Non-IID data and device heterogeneity cause client gradients to conflict on the server during federated learning, which hinders convergence. The authors propose FedGH, a gradient harmonization method that computes pairwise cosine similarities among the client gradients $g_t^k$. For each conflicting pair $(i, j)$, that is, a pair with negative cosine similarity, it applies the orthogonal projections $g_t^i \leftarrow g_t^i - \frac{g_t^i \cdot \widetilde{g}_t^j}{\|\widetilde{g}_t^j\|^2} \widetilde{g}_t^j$ and $g_t^j \leftarrow g_t^j - \frac{g_t^j \cdot \widetilde{g}_t^i}{\|\widetilde{g}_t^i\|^2} \widetilde{g}_t^i$, and then aggregates the harmonized gradients with weights $\frac{n_k}{n}$. The global objective is $L(w) = \sum_{k=1}^{K} \frac{n_k}{n} L_k(w)$, where $n_k$ is the number of samples on client $k$ and $n = \sum_{k=1}^{K} n_k$. The method is designed as a plug-and-play server-side module that needs no hyperparameter tuning. Experiments on CIFAR-10/100, Tiny-ImageNet, and LEAF show that FedGH consistently improves the baselines (FedAvg, FedProx, FedNova, FedDecorr), with larger gains under stronger non-IIDness and fewer communication rounds needed. Overall, FedGH offers a simple mechanism for mitigating gradient conflicts in heterogeneous FL, which improves convergence and eases practical deployment.
Abstract
Federated learning (FL) is a privacy-preserving paradigm for collaboratively training a global model from decentralized clients. However, the performance of FL is hindered by non-independent and identically distributed (non-IID) data and device heterogeneity. In this work, we revisit this key challenge through the lens of gradient conflicts on the server side. Specifically, we first investigate the gradient conflict phenomenon among multiple clients and reveal that stronger heterogeneity leads to more severe gradient conflicts. To tackle this issue, we propose FedGH, a simple yet effective method that mitigates local drifts through Gradient Harmonization. This technique projects one gradient vector onto the orthogonal plane of the other within conflicting client pairs. Extensive experiments demonstrate that FedGH consistently enhances multiple state-of-the-art FL baselines across diverse benchmarks and non-IID scenarios. Notably, FedGH yields more significant improvements in scenarios with stronger heterogeneity. As a plug-and-play module, FedGH can be seamlessly integrated into any FL framework without requiring hyperparameter tuning.
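The projection step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: each client update is a flattened 1-D vector, $\widetilde{g}_t^j$ is read as a snapshot of client $j$'s gradient taken before any projection, conflicts are detected on those snapshots, and the function names `harmonize` and `aggregate` are hypothetical.

```python
import numpy as np


def harmonize(grads):
    """Project conflicting client gradients onto each other's orthogonal plane.

    grads: list of 1-D arrays, one flattened update per client (assumption).
    """
    # Assumption: the tilde gradients are unmodified snapshots of the inputs.
    originals = [np.asarray(g, dtype=float) for g in grads]
    harmonized = [g.copy() for g in originals]
    num_clients = len(originals)
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            ref_i, ref_j = originals[i], originals[j]
            # A negative dot product means negative cosine similarity,
            # i.e. a conflicting pair; both norms are then non-zero.
            if np.dot(ref_i, ref_j) < 0:
                harmonized[i] -= np.dot(harmonized[i], ref_j) / np.dot(ref_j, ref_j) * ref_j
                harmonized[j] -= np.dot(harmonized[j], ref_i) / np.dot(ref_i, ref_i) * ref_i
    return harmonized


def aggregate(grads, sample_counts):
    """Weighted sum of harmonized gradients with weights n_k / n."""
    weights = np.asarray(sample_counts, dtype=float)
    weights /= weights.sum()
    return sum(w * g for w, g in zip(weights, harmonize(grads)))
```

For two clients with updates $(1, 0)$ and $(-1, 1)$, whose dot product is $-1$, this sketch returns $(0.5, 0.5)$ and $(0, 1)$: each harmonized vector is orthogonal to the other client's original gradient. Pairs with non-negative similarity pass through unchanged.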
