On the Power of Adaptive Weighted Aggregation in Heterogeneous Federated Learning and Beyond
Dun Zeng, Zenglin Xu, Shiyu Liu, Yu Pan, Qifan Wang, Xiaoying Tang
TL;DR
The paper asks why FedAvg often converges well under heterogeneous clients despite the pessimistic theoretical bounds of prior work. It introduces client consensus dynamics and Local Update Diversity (LUD) as practical lenses for understanding training dynamics, and proposes FedAWARE, an adaptive weighted aggregation module that minimizes the norm of the aggregated local updates and can plug into existing FL algorithms. The authors prove that, under standard smoothness and unbiasedness assumptions, FedAvg converges when the consensus measure decays, with a bound that includes a consensus term. Adaptive aggregation reduces this term, yielding faster convergence and improved generalization, and FedAWARE further enlarges LUD to strengthen generalization. Experiments on CIFAR-10/100 and AGNews across multiple architectures show faster convergence, more stable generalization, and compatibility of FedAWARE as a plug-in, supporting its practical value in heterogeneous FL deployments.
Abstract
Federated averaging (FedAvg) is the most fundamental algorithm in federated learning (FL). Previous theoretical results assert that the convergence and generalization of FedAvg degrade under heterogeneous clients. However, recent empirical results show that FedAvg can perform well in many real-world heterogeneous tasks. These results reveal an inconsistency between FL theory and practice that is not fully explained. In this paper, we show through rigorous convergence analysis that common heterogeneity measures contribute to this inconsistency. Furthermore, we introduce a new measure, client consensus dynamics, and prove that FedAvg can effectively handle client heterogeneity when an appropriate aggregation strategy is used. Building on this theoretical insight, we present a simple and effective FedAvg variant termed FedAWARE. Extensive experiments on three datasets and two modern neural network architectures demonstrate that FedAWARE ensures faster convergence and better generalization in heterogeneous client settings. Moreover, our results show that FedAWARE can significantly enhance the generalization performance of advanced FL algorithms when used as a plug-in module.
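
The TL;DR describes FedAWARE as choosing aggregation weights that minimize the norm of the aggregated local updates. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes the weights are constrained to the probability simplex and solves the resulting min-norm problem with Frank-Wolfe steps and exact line search, a solver choice made here purely for illustration. The function names (`dot`, `aggregate`, `min_norm_weights`) are hypothetical, and local updates are represented as plain lists of floats.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate(updates: Sequence[Sequence[float]],
              weights: Sequence[float]) -> List[float]:
    """Weighted sum of client updates: sum_k weights[k] * updates[k]."""
    dim = len(updates[0])
    return [sum(w * u[i] for w, u in zip(weights, updates)) for i in range(dim)]


def min_norm_weights(updates: Sequence[Sequence[float]],
                     iters: int = 100) -> List[float]:
    """Approximate the simplex weights that minimize ||sum_k w_k * d_k||.

    Assumption: Frank-Wolfe with exact line search is used here for
    illustration; the source does not state which solver FedAWARE uses.
    """
    n = len(updates)
    weights = [1.0 / n] * n  # start from uniform (FedAvg-style) weights
    for _ in range(iters):
        v = aggregate(updates, weights)
        # Gradient of 0.5 * ||v||^2 with respect to w_k is <d_k, v>.
        grads = [dot(u, v) for u in updates]
        k = min(range(n), key=grads.__getitem__)
        # Moving toward vertex k gives v(gamma) = v - gamma * (v - d_k).
        diff = [vi - ui for vi, ui in zip(v, updates[k])]
        denom = dot(diff, diff)
        if denom == 0.0:
            break  # aggregate already equals the selected update
        gamma = max(0.0, min(1.0, dot(diff, v) / denom))
        if gamma == 0.0:
            break  # no step along this direction reduces the norm
        weights = [(1.0 - gamma) * w for w in weights]
        weights[k] += gamma
    return weights


if __name__ == "__main__":
    # Two clients whose updates partially cancel along the first axis.
    client_updates = [[2.0, 0.0], [-1.0, 0.0]]
    w = min_norm_weights(client_updates)
    print("weights:", w)
    print("aggregated update:", aggregate(client_updates, w))
```

In this toy setting the returned weights stay non-negative and sum to one, so the result can stand in for the uniform average in a FedAvg-style server step. How the weights interact with server learning rates, momentum, or partial client participation is not covered by this sketch.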
