Byzantine-Robust Federated Learning with Learnable Aggregation Weights
Javad Parsa, Amir Hossein Daghestani, André M. H. Teixeira, Mikael Johansson
TL;DR
This work tackles Byzantine-robust federated learning under data heterogeneity by introducing FedLAW, a framework where aggregation weights are learnable and regularized via a sparse unit-capped simplex. The method jointly optimizes the global model $\boldsymbol{\theta}$ and the weights $\mathbf{w}$ through an alternating minimization scheme, using an inner quadratic-approximation step for $\boldsymbol{\theta}$ and a proximal, three-step projection to enforce sparsity and a unit-sum constraint on $\mathbf{w}$. Theoretical results establish Byzantine resilience and convergence to a neighborhood of the optimum under non-iid data and adversarial updates, with bounds that depend on asymptotic bias and variance quantities. Empirically, FedLAW consistently outperforms classical Byzantine defenses on MNIST and CIFAR-10 across multiple attack types and levels of data heterogeneity, and its learned weights suppress malicious clients rapidly, highlighting its practical robustness for secure federated deployments.
Abstract
Federated Learning (FL) enables clients to collaboratively train a global model without sharing their private data. However, the presence of malicious (Byzantine) clients poses significant challenges to the robustness of FL, particularly when data distributions across clients are heterogeneous. In this paper, we propose a novel Byzantine-robust FL optimization problem that incorporates adaptive weighting into the aggregation process. Unlike conventional approaches, our formulation treats aggregation weights as learnable parameters, jointly optimizing them alongside the global model parameters. To solve this optimization problem, we develop an alternating minimization algorithm with strong convergence guarantees under adversarial attack. We analyze the Byzantine resilience of the proposed objective. We evaluate the performance of our algorithm against state-of-the-art Byzantine-robust FL approaches across various datasets and attack scenarios. Experimental results demonstrate that our method consistently outperforms existing approaches, particularly in settings with highly heterogeneous data and a large proportion of malicious clients.
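To make the idea of learnable aggregation weights concrete, the following toy sketch alternates between updating a global estimate and updating the weights. It is not the FedLAW algorithm: the squared-distance loss, the ridge term `reg`, the step size, and the function names are all illustrative assumptions, and the standard sort-based Euclidean projection onto the probability simplex stands in for the paper's three-step projection onto the sparse unit-capped simplex.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1} (sort-based)."""
    u = sorted(v, reverse=True)
    cumsum, tau = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumsum += u_k
        t = (cumsum - 1.0) / k
        if u_k - t > 0:          # holds for k = 1..rho; the last hit gives the threshold
            tau = t
    return [max(x - tau, 0.0) for x in v]


def alternating_aggregate(updates, rounds=50, step=0.5, reg=1.0):
    """Toy alternating minimization of sum_i w_i * ||theta - u_i||^2 + (reg/2) * ||w||^2.

    `updates` is a list of client vectors u_i (hypothetical stand-ins for client models).
    """
    n, dim = len(updates), len(updates[0])
    w = [1.0 / n] * n            # start from uniform averaging
    theta = [0.0] * dim
    for _ in range(rounds):
        # theta-step: for fixed w on the simplex, the weighted mean is the minimizer.
        theta = [sum(w[i] * updates[i][d] for i in range(n)) for d in range(dim)]
        # w-step: one projected gradient step; d/dw_i = loss_i + reg * w_i.
        losses = [sum((theta[d] - updates[i][d]) ** 2 for d in range(dim))
                  for i in range(n)]
        w = project_to_simplex(
            [w[i] - step * (losses[i] + reg * w[i]) for i in range(n)]
        )
    return theta, w


# Four honest clients near (1, 1) and one Byzantine client sending an outlier.
updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.0], [10.0, -10.0]]
theta, w = alternating_aggregate(updates)
print("theta:", theta)
print("weights:", w)
```

In this toy setting the outlier's loss dominates the weight gradient, so the projection drives its weight to zero and the estimate stays inside the cluster of honest updates. The ridge term is only there to keep the weights from collapsing onto a single client; it is a design choice of this sketch, not a claim about the regularizer used in the paper.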
