Byzantine-Resilient Zero-Order Optimization for Communication-Efficient Heterogeneous Federated Learning
Maximilian Egger, Mayank Bakshi, Rawad Bitar
TL;DR
CyBeR-0 tackles Byzantine-resilient federated learning under data heterogeneity using zero-order optimization. It introduces transformed robust aggregation performed in the perturbation embedding space, enabled by Johnson-Lindenstrauss embeddings and a shared pseudorandom seed for perturbation directions, achieving drastic communication savings while preserving robustness. The framework supports multiple local ZO epochs, works with common robust rules (CWTM, Krum, NNM), and provides convergence guarantees for non-convex objectives. Empirical results on MNIST and RoBERTa-large fine-tuning demonstrate strong worst-case robustness with up to seven orders of magnitude reduction in communication and reduced memory usage, making it suitable for resource-constrained edge deployments.
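The mechanism summarized above can be illustrated with a toy sketch. It is not the authors' algorithm: it assumes a single local zero-order step per round, quadratic client losses, and coordinate-wise trimmed mean as the robust rule, and every name in it (`shared_directions`, `zo_scalars`, `trimmed_mean`, `run`) is hypothetical. Each client projects its gradient onto `k` Gaussian directions regenerated from a shared seed, so only `k` scalars travel per round, and the server aggregates robustly in that `k`-dimensional space.

```python
import numpy as np

def shared_directions(seed, round_idx, k, dim):
    # Server and clients regenerate identical directions from (seed, round),
    # so the directions themselves are never transmitted.
    rng = np.random.default_rng([seed, round_idx])
    return rng.standard_normal((k, dim))

def zo_scalars(loss, w, Z, mu=1e-3):
    # Two-point zero-order estimate of the directional derivative along each row of Z.
    return np.array([(loss(w + mu * z) - loss(w - mu * z)) / (2 * mu) for z in Z])

def trimmed_mean(S, n_trim):
    # Coordinate-wise trimmed mean over clients (rows): drop the n_trim
    # smallest and n_trim largest values in every coordinate, then average.
    S_sorted = np.sort(S, axis=0)
    return S_sorted[n_trim:S.shape[0] - n_trim].mean(axis=0)

def run(num_rounds=200, dim=20, k=4, n_honest=8, n_byz=2, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed toy setup: heterogeneous quadratic losses with distinct local optima.
    targets = 1.0 + 0.1 * rng.standard_normal((n_honest, dim))
    losses = [lambda w, t=t: 0.5 * np.sum((w - t) ** 2) for t in targets]
    w = np.zeros(dim)
    for r in range(num_rounds):
        Z = shared_directions(seed, r, k, dim)
        honest = np.stack([zo_scalars(f, w, Z) for f in losses])  # k scalars per client
        byz = np.full((n_byz, k), 1e6)  # assumed attack: report huge values
        agg = trimmed_mean(np.vstack([honest, byz]), n_trim=n_byz)
        # Map the aggregated scalars back to model space with the same directions.
        w = w - lr * (agg @ Z) / k
    return w, targets.mean(axis=0)

if __name__ == "__main__":
    w, ref = run()
    print("distance to honest average optimum:", np.linalg.norm(w - ref))
```

In this sketch the per-round uplink and downlink payloads are `k` floats instead of `dim` floats, which is where the communication savings come from. Replacing `trimmed_mean` with a plain mean lets the two Byzantine clients dominate every update.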
Abstract
We introduce CyBeR-0, a Byzantine-resilient federated zero-order optimization method that provides significant savings in uplink and downlink communication costs. We introduce transformed robust aggregation to give convergence guarantees for general non-convex objectives under client data heterogeneity. Empirical evaluations on standard learning tasks and on fine-tuning large language models show that CyBeR-0 exhibits stable performance while communicating only a few scalars per round and with reduced memory requirements.
