Byzantine-Resilient Zero-Order Optimization for Communication-Efficient Heterogeneous Federated Learning

Maximilian Egger, Mayank Bakshi, Rawad Bitar

TL;DR

CyBeR-0 tackles Byzantine-resilient federated learning under data heterogeneity using zero-order optimization. It introduces transformed robust aggregation performed in the perturbation embedding space, enabled by Johnson-Lindenstrauss embeddings and a shared pseudorandom seed for perturbation directions, achieving drastic communication savings while preserving robustness. The framework supports multiple local ZO epochs, works with common robust rules (CWTM, Krum, NNM), and provides convergence guarantees for non-convex objectives. Empirical results on MNIST and RoBERTa-large fine-tuning demonstrate strong worst-case robustness with up to seven orders of magnitude reduction in communication and reduced memory usage, making it suitable for resource-constrained edge deployments.

Abstract

We introduce CyBeR-0, a Byzantine-resilient federated zero-order optimization method that is robust under Byzantine attacks and provides significant savings in uplink and downlink communication costs. We introduce transformed robust aggregation to give convergence guarantees for general non-convex objectives under client data heterogeneity. Empirical evaluations for standard learning tasks and fine-tuning large language models show that CyBeR-0 exhibits stable performance with only a few scalars per-round communication cost and reduced memory requirements.
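The communication saving described above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a standard central-difference two-point estimate along Gaussian directions, and it uses a trimmed mean over the reported scalars as a simplified stand-in for the transformed robust aggregation. All names here (`direction`, `client_report`, `trimmed_mean`, `server_step`, `run_demo`), the smoothing step `h`, and the quadratic toy losses are hypothetical choices made for illustration. The point is only that, once the seeds are shared, each client sends a few scalars per round instead of a full gradient vector.

```python
import numpy as np


def direction(seed, dim):
    """Regenerate a Gaussian perturbation direction from a shared integer seed."""
    return np.random.default_rng(seed).standard_normal(dim)


def client_report(loss, x, seeds, h=1e-3):
    """Honest client: one two-point finite-difference scalar per shared seed."""
    out = []
    for s in seeds:
        z = direction(s, x.size)
        out.append((loss(x + h * z) - loss(x - h * z)) / (2.0 * h))
    return np.array(out)


def trimmed_mean(reports, b):
    """Per column, drop the b smallest and b largest entries and average the rest."""
    r = np.sort(np.asarray(reports, dtype=float), axis=0)
    return r[b:r.shape[0] - b].mean(axis=0)


def server_step(x, reports, seeds, b, lr):
    """Aggregate the scalar reports robustly, then rebuild the update from the seeds."""
    agg = trimmed_mean(reports, b)  # one aggregated scalar per direction
    g = sum(a * direction(s, x.size) for a, s in zip(agg, seeds)) / len(seeds)
    return x - lr * g


def run_demo(rounds=300, n_honest=8, n_byz=2, dim=5, k=4, lr=0.05):
    """Toy run: heterogeneous quadratic losses, Byzantine clients send huge scalars."""
    rng = np.random.default_rng(12345)
    # Assumption: each honest client has its own optimum (data heterogeneity).
    centers = 3.0 + 0.1 * rng.standard_normal((n_honest, dim))
    losses = [lambda w, c=c: 0.5 * np.sum((w - c) ** 2) for c in centers]
    x = np.zeros(dim)
    for t in range(rounds):
        seeds = [t * k + j for j in range(k)]  # known to the server and all clients
        honest = [client_report(f, x, seeds) for f in losses]
        byz = [np.full(k, 1e6) for _ in range(n_byz)]  # arbitrary malicious scalars
        x = server_step(x, np.vstack(honest + byz), seeds, b=n_byz, lr=lr)
    # Distance of the final iterate to the average honest optimum.
    return np.linalg.norm(x - centers.mean(axis=0))
```

Calling `run_demo()` returns the distance between the final iterate and the average of the honest optima. In this toy setting each client uploads `k` scalars per round regardless of the model dimension `dim`. Local ZO epochs, Krum, NNM, and the Johnson-Lindenstrauss analysis are not covered by this sketch.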

Paper Structure

This paper contains 39 sections, 17 theorems, 78 equations, 17 figures, 10 tables, 5 algorithms.

Key Result

Proposition 5.5

In expectation, the ZO estimate satisfies

Figures (17)

  • Figure 1: Performance of different robust aggregation rules against different attacks for logistic regression on MNIST.
  • Figure 2: Accuracy over epochs for fine-tuning RoBERTa-large on TREC under different attack scenarios for non-i.i.d. data.
  • Figure 3: Comparisons of Local Epoch Strategies for $\mu=0.001$
  • Figure 4: Comparisons of Local Epoch Strategies for $\mu=0$
  • Figure 5: Comparison of Zero-Order Optimization for Different Values of $\nu$ Compared to the Baseline FedAvg.
  • ...and 12 more figures

Theorems & Definitions (33)

  • Definition 2.1: Two-Point Zero-Order Estimate
  • Definition 2.2: $(b, \kappa)$-Robust Aggregation
  • Proposition 5.5
  • Proposition 5.6: Lemma 2, tang2020distributed
  • Theorem 5.7: General non-convex landscapes
  • Theorem 5.9: Lipschitz objective functions
  • Proof: Sketch of Proof
  • Proof
  • Lemma 2.1
  • Lemma 2.2
  • ...and 23 more