Achieving Byzantine-Resilient Federated Learning via Layer-Adaptive Sparsified Model Aggregation
Jiahao Xu, Zikai Zhang, Rui Hu
TL;DR
This work addresses Byzantine resilience in Federated Learning (FL) by introducing LASA, a two-stage server-side defense that combines per-client Top-$k$ sparsification with layer-wise adaptive aggregation. By capturing both magnitude and direction at the layer level and applying median-based Z-score filtering, LASA mitigates the impact of malicious updates while preserving benign model utility, especially in non-IID settings. The authors formalize a $\kappa$-robustness criterion, prove that LASA achieves tight robustness bounds, and establish $(f,R)$-Byzantine resilience for FL with LASA. Extensive experiments across IID and non-IID data, multiple attack methods, and varying sparsification and filtering parameters show that LASA consistently outperforms state-of-the-art defenses in robustness and accuracy while keeping computational costs manageable. The approach offers practical, theoretically grounded protection for real-world FL deployments.
Abstract
Federated Learning (FL) enables multiple clients to collaboratively train a model without sharing their local data. However, FL systems are vulnerable to well-designed Byzantine attacks, which aim to disrupt the training process by uploading malicious model updates. Existing defenses based on robust aggregation rules overlook the diversity of magnitude and direction across different layers of the model updates, which limits their robustness, particularly in non-IID settings. To address these challenges, we propose the Layer-Adaptive Sparsified Model Aggregation (LASA) approach, which combines pre-aggregation sparsification with layer-wise adaptive aggregation to improve robustness. Specifically, LASA includes a pre-aggregation sparsification module that sparsifies the update from each client before aggregation, reducing the impact of malicious parameters and minimizing the interference of less important parameters with the subsequent filtering process. Based on the sparsified updates, a layer-wise adaptive filter then selects benign layers for aggregation using both magnitude and direction metrics computed across all clients. We provide a detailed theoretical robustness analysis of LASA and a resilience analysis of FL integrated with LASA. Extensive experiments on various IID and non-IID datasets demonstrate the effectiveness of LASA. Code is available at \url{https://github.com/JiiahaoXU/LASA}.
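
The two stages described above (per-client Top-$k$ sparsification, then layer-wise filtering on magnitude and direction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which is available in the linked repository. The function names, the choice of the L2 norm as the magnitude metric, the fraction of positive non-zero entries as the direction metric, the use of the standard deviation as the Z-score scale, the defaults `k_ratio=0.3` and `z_thresh=2.0`, and the coordinate-wise median fallback are all assumptions made here for illustration.

```python
import numpy as np

def topk_sparsify(update, k_ratio):
    """Zero all but the largest-magnitude entries of one client's update.

    `update` is a list of per-layer arrays. Roughly a `k_ratio` fraction of
    entries is kept across all layers (ties at the threshold are all kept).
    """
    flat = np.concatenate([layer.ravel() for layer in update])
    k = min(flat.size, max(1, int(round(k_ratio * flat.size))))
    # The k-th largest absolute value serves as the keep threshold.
    thresh = np.partition(np.abs(flat), flat.size - k)[flat.size - k]
    return [np.where(np.abs(layer) >= thresh, layer, 0.0) for layer in update]

def median_zscore(values):
    """Z-score centred on the median (assumed form: (x - median) / std)."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        return np.zeros_like(values)
    return (values - np.median(values)) / std

def lasa_like_aggregate(updates, k_ratio=0.3, z_thresh=2.0):
    """Sparsify each client update, then filter and average layer by layer."""
    sparse = [topk_sparsify(u, k_ratio) for u in updates]
    aggregated = []
    for l in range(len(sparse[0])):
        layers = np.stack([s[l] for s in sparse])   # (n_clients, *layer_shape)
        flat = layers.reshape(len(sparse), -1)
        # Magnitude metric (assumption): L2 norm of the sparsified layer.
        magnitude = np.linalg.norm(flat, axis=1)
        # Direction metric (assumption): share of positive non-zero entries.
        nonzero = np.maximum((flat != 0).sum(axis=1), 1)
        direction = (flat > 0).sum(axis=1) / nonzero
        # A client's layer is kept only if it passes both filters.
        keep = (np.abs(median_zscore(magnitude)) <= z_thresh) & \
               (np.abs(median_zscore(direction)) <= z_thresh)
        if keep.any():
            aggregated.append(layers[keep].mean(axis=0))
        else:
            # Fallback (assumption): coordinate-wise median over all clients.
            aggregated.append(np.median(layers, axis=0))
    return aggregated
```

Because the filter runs independently for each layer, a client can be excluded from one layer's average and still contribute to another, which is the layer-adaptive behaviour the abstract describes.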
