Approximate Agreement Algorithms for Byzantine Collaborative Learning

Mélanie Cambus, Darya Melnyk, Tijana Milentijević, Stefan Schmid

TL;DR

This work tackles Byzantine-tolerant distributed learning by examining gradient aggregation via the geometric median and its interaction with approximate agreement. It shows that traditional safe-area and MD-based approaches either fail to bound the geometric-median error or fail to converge, and introduces a hyperbox-based geometric-median algorithm that guarantees convergence and achieves a bound of $2\sqrt{d}$ relative to the minimum covering ball radius. The approach is evaluated empirically in centralized and decentralized settings under sign-flip Byzantine attacks on non-i.i.d. data, where it shows stronger resilience than mean-based methods on MNIST and CIFAR10 tasks. The results advance robust aggregation for distributed learning under severe heterogeneity and adversarial conditions, and point to future work on additional aggregation rules and cross-round Byzantine strategies.

Abstract

In Byzantine collaborative learning, $n$ clients in a peer-to-peer network collectively learn a model without sharing their data by exchanging and aggregating stochastic gradient estimates. Byzantine clients can prevent others from collecting identical sets of gradient estimates. The aggregation step thus needs to be combined with an efficient (approximate) agreement subroutine to ensure convergence of the training process. In this work, we study the geometric median aggregation rule for Byzantine collaborative learning. We show that known approaches do not provide theoretical guarantees on convergence or gradient quality in the agreement subroutine. To satisfy these theoretical guarantees, we present a hyperbox algorithm for geometric median aggregation. We practically evaluate our algorithm in both centralized and decentralized settings under Byzantine attacks on non-i.i.d. data. We show that our geometric median-based approaches can tolerate sign-flip attacks better than known mean-based approaches from the literature.
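
The contrast between mean and geometric median aggregation under a sign-flip attack can be illustrated with a minimal sketch. It is not the paper's hyperbox algorithm and includes no agreement subroutine: it only approximates the geometric median with the standard Weiszfeld iteration. The function name `geometric_median`, the client counts, the noise level, and the scaling factor of the flipped gradients are all assumptions chosen for illustration.

```python
import numpy as np


def geometric_median(points, iters=100, eps=1e-9):
    """Approximate the geometric median with the Weiszfeld iteration.

    The geometric median minimizes the sum of Euclidean distances to the
    input points. This helper is illustrative, not the paper's algorithm.
    """
    points = np.asarray(points, dtype=float)
    estimate = points.mean(axis=0)  # start from the coordinate-wise mean
    for _ in range(iters):
        dists = np.linalg.norm(points - estimate, axis=1)
        # Clamp distances to avoid division by zero at a data point.
        weights = 1.0 / np.maximum(dists, eps)
        new_estimate = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_estimate - estimate) < eps:
            return new_estimate
        estimate = new_estimate
    return estimate


# Toy setup (all values are assumptions): 8 honest clients report noisy
# copies of one gradient; 2 Byzantine clients send sign-flipped, scaled copies.
rng = np.random.default_rng(0)
true_grad = np.array([1.0, -2.0, 0.5])
honest = true_grad + 0.1 * rng.standard_normal((8, 3))
byzantine = -10.0 * honest[:2]
received = np.vstack([honest, byzantine])

mean_err = np.linalg.norm(received.mean(axis=0) - true_grad)
gm_err = np.linalg.norm(geometric_median(received) - true_grad)
print(f"mean error: {mean_err:.3f}, geometric median error: {gm_err:.3f}")
```

In this toy setting the two flipped gradients drag the mean far from the honest cluster, while the geometric median stays close to it because a minority of outliers cannot move it arbitrarily far.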

Paper Structure

This paper contains 20 sections, 5 theorems, 19 equations, 3 figures, 2 algorithms.

Key Result

Lemma 3.2

The true geometric median $\mu^*$ is inside the convex hull of possible geometric medians of each correct node: $\mu^* \in \mathrm{Conv}(S_{\mathrm{geo}}(i))$, for all $i \in [n]$.

Figures (3)

  • Figure 1: Centralized collaborative learning with $f=1$ on an MLP architecture and the MNIST dataset, under different levels of heterogeneity
  • Figure 2: Centralized collaborative learning on MLP and CifarNet, using the MNIST and CIFAR10 datasets
  • Figure 3: Decentralized collaborative learning model on an MLP architecture with mildly heterogeneous data

Theorems & Definitions (21)

  • Definition 2.1: Mean
  • Definition 2.2: Geometric median
  • Definition 2.3: Safe area multidim-approx-agreement
  • Definition 2.4: Trusted hyperbox
  • Definition 2.5: Locally trusted hyperbox
  • Definition 3.1
  • Lemma 3.2
  • proof
  • Definition 3.3
  • Definition 3.4
  • ...and 11 more
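
For orientation, the two aggregation rules named in Definitions 2.1 and 2.2 are commonly written as follows. This listing does not reproduce the paper's own statements, so the symbols $v_1, \dots, v_m \in \mathbb{R}^d$ for the aggregated vectors and the operator name $\mathrm{GeoMed}$ are assumptions; these are the standard textbook forms, not necessarily the paper's exact notation.

```latex
\[
  \bar{v} \;=\; \frac{1}{m} \sum_{j=1}^{m} v_j ,
  \qquad
  \mathrm{GeoMed}(v_1, \dots, v_m) \;=\;
  \operatorname*{arg\,min}_{x \in \mathbb{R}^d} \; \sum_{j=1}^{m} \lVert x - v_j \rVert_2 .
\]
```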