Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries
Lihan Xu, Yanjie Dong, Gang Wang, Runhao Zeng, Xiaoyi Fan, Xiping Hu
TL;DR
The paper addresses robust federated learning in the presence of Byzantine adversaries by introducing Byrd-NAFL, which combines Nesterov momentum with Byzantine-resilient aggregation to accelerate and safeguard convergence. It provides a finite-time convergence guarantee for smooth non-convex losses under a soft Byzantine resilience assumption and analyzes how adversarial perturbations, momentum, and stochastic noise affect learning. The authors demonstrate that Byrd-NAFL outperforms baselines on COVTYPE and MNIST across multiple attack types, achieving faster convergence and higher accuracy while maintaining resilience. This approach offers a practical pathway to reliable, communication-efficient federated learning in adversarial environments, leveraging momentum without sacrificing robustness.
Abstract
We investigate robust federated learning, where a group of workers collaboratively train a shared model under the orchestration of a central server in the presence of Byzantine adversaries capable of arbitrary and potentially malicious behaviors. To simultaneously enhance communication efficiency and robustness against such adversaries, we propose a Byzantine-resilient Nesterov-Accelerated Federated Learning (Byrd-NAFL) algorithm. Byrd-NAFL integrates Nesterov's momentum into the federated learning process alongside Byzantine-resilient aggregation rules to achieve fast convergence that is safeguarded against gradient corruption. We establish a finite-time convergence guarantee for Byrd-NAFL under non-convex and smooth loss functions with a relaxed assumption on the aggregated gradients. Extensive numerical experiments validate the effectiveness of Byrd-NAFL and demonstrate its superiority over existing benchmarks in terms of convergence speed, accuracy, and resilience to diverse Byzantine attack strategies.
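The combination described above, a Nesterov-style momentum step driven by a robustly aggregated gradient, can be illustrated with a minimal sketch. It is not the Byrd-NAFL algorithm itself: the abstract does not specify the aggregation rule or whether momentum is applied at the workers or at the server, so the sketch assumes a coordinate-wise median aggregator and server-side momentum. The toy quadratic loss, the attack vector, and all function and parameter names (`run_sketch`, `lr`, `beta`, and so on) are hypothetical.

```python
from statistics import median


def coordinate_median(vectors):
    """Coordinate-wise median; an assumed stand-in for a Byzantine-resilient rule."""
    return [median(coords) for coords in zip(*vectors)]


def honest_gradient(point, target):
    """Gradient of the toy loss 0.5 * ||point - target||^2 (assumption)."""
    return [p - t for p, t in zip(point, target)]


def byzantine_gradient(dim):
    """Hypothetical attack: the adversary reports a large arbitrary vector."""
    return [1e6] * dim


def run_sketch(targets, num_byzantine, steps=200, lr=0.1, beta=0.9):
    """Server-side Nesterov momentum on robustly aggregated gradients."""
    dim = len(targets[0])
    x = [0.0] * dim  # shared model
    v = [0.0] * dim  # momentum buffer
    for _ in range(steps):
        # Nesterov look-ahead point at which workers evaluate gradients.
        lookahead = [xi + beta * vi for xi, vi in zip(x, v)]
        grads = [honest_gradient(lookahead, t) for t in targets]
        grads += [byzantine_gradient(dim) for _ in range(num_byzantine)]
        # Robust aggregation limits the influence of corrupted reports.
        g = coordinate_median(grads)
        v = [beta * vi - lr * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    honest_targets = [[0.9, 1.9], [1.0, 2.0], [1.1, 2.1]]
    print(run_sketch(honest_targets, num_byzantine=1))
```

In this toy setting the iterate stays within the range spanned by the honest workers' minimizers even though one of the four reports is corrupted, whereas a plain average of the same reports would be dominated by the adversarial vector.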
