Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization
Afonso de Sá Delgado Neto, Maximilian Egger, Mayank Bakshi, Rawad Bitar
TL;DR
CyBeR-0 addresses Byzantine-resilient federated learning with memory- and communication-efficient zero-order optimization. It compresses a $d$-dimensional gradient into $k$ scalars by deriving the perturbation directions from a shared seed, and it uses a trimmed-mean robust aggregator to mitigate adversarial updates. The method comes with convergence guarantees for convex losses under IID data. Empirically, CyBeR-0 matches or closely approaches non-Byzantine accuracy on MNIST, and it yields communication savings of up to orders of magnitude when fine-tuning RoBERTa-Large on NLP tasks under Byzantine attacks. By combining zero-order estimation, communication compression, and Byzantine robustness, the work targets practical, robust federated learning in resource-constrained environments.
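The two ingredients named above, seed-based compression to $k$ scalars and trimmed-mean aggregation, can be illustrated with a minimal sketch. This is not the authors' implementation. The two-point finite-difference estimator, the Gaussian directions, the smoothing parameter `mu`, the `1/k` normalization, and all function names are assumptions chosen for illustration.

```python
import random

def directions(seed, k, d):
    """Regenerate the same k Gaussian perturbation directions from a shared seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]

def client_scalars(loss, x, seed, k, mu=1e-3):
    """Assumed two-point zero-order estimate: one scalar per direction."""
    scalars = []
    for z in directions(seed, k, len(x)):
        plus = [xi + mu * zi for xi, zi in zip(x, z)]
        minus = [xi - mu * zi for xi, zi in zip(x, z)]
        scalars.append((loss(plus) - loss(minus)) / (2 * mu))
    return scalars  # k numbers are sent instead of a d-dimensional vector

def trimmed_mean(values, trim):
    """Drop the `trim` smallest and `trim` largest values, then average the rest."""
    ordered = sorted(values)
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)

def server_aggregate(reports, seed, d, trim):
    """Trim each of the k scalars across clients, then rebuild a d-dim estimate."""
    k = len(reports[0])
    agg = [trimmed_mean([r[j] for r in reports], trim) for j in range(k)]
    zs = directions(seed, k, d)  # the server regenerates directions from the seed
    return [sum(agg[j] * zs[j][i] for j in range(k)) / k for i in range(d)]
```

In this sketch, a client uploads only the `k` values returned by `client_scalars`, and the server recovers a `d`-dimensional update because both sides call `directions` with the same seed. Setting `trim` to at least the number of Byzantine clients discards their extreme reports before averaging.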
Abstract
We introduce CyBeR-0, the first zero-order optimization algorithm for memory- and communication-efficient Federated Learning that is resilient to Byzantine faults. We show through extensive numerical experiments on the MNIST dataset and on fine-tuning RoBERTa-Large that CyBeR-0 outperforms state-of-the-art algorithms in terms of communication and memory efficiency while reaching similar accuracy. We provide theoretical guarantees on its convergence for convex loss functions.
