Internal Cross-layer Gradients for Extending Homogeneity to Heterogeneity in Federated Learning
Yun-Hin Chan, Rui Zhou, Running Zhao, Zhihan Jiang, Edith C.-H. Ngai
TL;DR
Federated learning must contend with system heterogeneity across clients, which hampers the performance of model-homogeneous FL methods. The authors introduce InCo Aggregation, a server-side strategy that leverages internal cross-layer gradients: it mixes gradients from shallow and deep layers, normalizes them, and solves a convex optimization problem to align gradient directions, thereby increasing deep-layer similarity without extra client communication. They establish convergence guarantees, with rates, in the non-convex setting and demonstrate empirical gains across CNNs (ResNets) and transformers (ViTs), improving both traditional homogeneous baselines and heterogeneous FL methods. The approach offers a practical, scalable pathway to robust FL under realistic heterogeneity, with minimal overhead and applicability to common architectures.
Abstract
Federated learning (FL) inevitably confronts the challenge of system heterogeneity in practical scenarios. We propose a training scheme that extends most model-homogeneous FL methods to cope with this challenge. In this paper, we begin with a detailed exploration of homogeneous and heterogeneous FL settings and make three key observations: (1) client performance is positively correlated with layer similarity, (2) shallow layers exhibit higher similarity than deep layers, and (3) smoother gradient distributions indicate higher layer similarity. Building upon these observations, we propose InCo Aggregation, which leverages internal cross-layer gradients, a mixture of gradients from shallow and deep layers within a server model, to increase the similarity in the deep layers without requiring additional communication between clients. Furthermore, our method can be tailored to model-homogeneous FL methods such as FedAvg, FedProx, FedNova, Scaffold, and MOON, extending them to handle system heterogeneity. Extensive experimental results validate the effectiveness of InCo Aggregation, highlighting internal cross-layer gradients as a promising avenue for improving performance in heterogeneous FL.
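The idea of an internal cross-layer gradient can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes that the shallow-layer and deep-layer gradients are flattened vectors of equal length, that normalization means rescaling the shallow gradient to the deep gradient's norm, and that the convex problem is the projection of the shallow gradient onto the half-space of vectors that do not oppose the deep gradient. The function names and the `weight` parameter are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rescale_to_norm(g, target_norm, eps=1e-12):
    """Rescale g so that its Euclidean norm is approximately target_norm."""
    norm = math.sqrt(dot(g, g))
    return [x * target_norm / (norm + eps) for x in g]


def align(g_shallow, g_deep, eps=1e-12):
    """Closed-form solution of  min ||x - g_shallow||^2  s.t.  <x, g_deep> >= 0.

    Assumed stand-in for the convex alignment step: if the two gradients
    conflict, the component of g_shallow that opposes g_deep is removed.
    """
    d = dot(g_shallow, g_deep)
    if d >= 0:
        return list(g_shallow)  # no conflict, nothing to correct
    coef = d / (dot(g_deep, g_deep) + eps)
    return [s - coef * g for s, g in zip(g_shallow, g_deep)]


def cross_layer_gradient(g_shallow, g_deep, weight=0.5):
    """Mix a normalized, aligned shallow-layer gradient into a deep-layer one.

    weight is a hypothetical mixing coefficient in [0, 1].
    """
    deep_norm = math.sqrt(dot(g_deep, g_deep))
    g_s = rescale_to_norm(g_shallow, deep_norm)
    g_s = align(g_s, g_deep)
    return [(1 - weight) * d + weight * s for d, s in zip(g_deep, g_s)]


# Usage: a shallow gradient that partly opposes the deep gradient.
mixed = cross_layer_gradient([-1.0, 1.0], [1.0, 0.0])
print(mixed)
```

In this sketch the mixing happens entirely on vectors the server already holds, which mirrors the abstract's point that no additional client communication is required.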
