On the Byzantine Fault Tolerance of signSGD with Majority Vote

Emanuele Mengoli, Luzius Moll, Virgilio Strozzi, El-Mahdi El-Mhamdi

TL;DR

This work derives an explicit bound, free of unknown constants, on the probability of incorrect aggregation in signSGD with majority vote, and from it a convergence bound with a precise convergence rate in the presence of Byzantine attackers.

Abstract

In distributed learning, sign-based compression algorithms such as signSGD with majority vote provide a lightweight alternative to SGD with an additional advantage: fault tolerance (almost) for free. However, for signSGD with majority vote, this fault tolerance has been shown to cover only the case of weaker adversaries, i.e., ones that are not omniscient or cannot collude to base their attack on common knowledge and strategy. In this work, we close this gap and provide new insights into how signSGD with majority vote can be resilient against omniscient and colluding adversaries, which craft an attack after communicating with other adversaries, thus having better information to perform the most damaging attack based on a common optimal strategy. Our core contribution is in providing a proof that begins by defining the omniscience framework and the strongest possible damage against signSGD with majority vote without imposing any restrictions on the attacker. Thanks to the filtering effect of the sign-based method, we upper-bound the space of attacks to the optimal strategy for maximizing damage by an attacker. Hence, we derive an explicit probabilistic bound in terms of incorrect aggregation without resorting to unknown constants, providing a convergence bound on signSGD with majority vote in the presence of Byzantine attackers, along with a precise convergence rate. Our findings are supported by experiments on the MNIST dataset in a distributed learning environment with adversaries of varying strength.
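
For orientation, the update rule usually associated with signSGD with majority vote can be written as below. This is the standard form of the algorithm and not necessarily the paper's exact notation: the step size $\eta$ and the per-worker stochastic gradient $\tilde{g}_q(x_t)$ are symbols assumed here, while $Q$ denotes the number of workers.

```latex
% Each worker q sends only the sign of its stochastic gradient;
% the server takes a coordinate-wise majority vote and broadcasts the result.
x_{t+1} = x_t - \eta \, \mathrm{sign}\!\left( \sum_{q=1}^{Q} \mathrm{sign}\big( \tilde{g}_q(x_t) \big) \right)
```

Because every worker contributes exactly one vote of magnitude 1 per coordinate, an adversary cannot dominate the aggregate by sending a large vector; it can only flip its own votes. This is the filtering effect the abstract refers to.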

Paper Structure

This paper contains 20 sections, 3 theorems, 16 equations, 2 figures, 3 tables, 1 algorithm.

Key Result

Theorem 3.5

(Strongest Damage on signSGD with Majority Vote). The strongest attack to maximally damage the objective function $f(x)$ at time $t$ that omniscient adversaries (controlling $\alpha Q$ workers) can mount is forcing the majority vote toward the opposite of $g(x)_t$.

Note that this is about bounding the damage of an arbitrary attack, and not assuming a particular attack.
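
The attack described in the theorem can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `majority_vote` and `aggregate_under_attack`, the Gaussian noise model for honest workers, and the example gradient are all assumptions made for illustration. Adversaries are modeled as knowing the true gradient and all voting for the opposite sign.

```python
import numpy as np


def majority_vote(sign_matrix):
    """Coordinate-wise majority vote over per-worker sign vectors (one row per worker)."""
    return np.sign(sign_matrix.sum(axis=0))


def aggregate_under_attack(true_grad, num_workers, num_adv, noise_std, rng):
    """Aggregate signs when num_adv of num_workers are colluding adversaries.

    Assumptions (hypothetical, for illustration only):
    - honest workers send the sign of the true gradient plus Gaussian noise;
    - adversaries know the true gradient and all send the opposite sign.
    """
    num_honest = num_workers - num_adv
    noise = rng.normal(0.0, noise_std, size=(num_honest, true_grad.size))
    honest_signs = np.sign(true_grad + noise)
    adv_signs = np.tile(-np.sign(true_grad), (num_adv, 1))
    return majority_vote(np.vstack([honest_signs, adv_signs]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_grad = np.array([1.0, -2.0, 0.5, -0.1])  # made-up example gradient
    for num_adv in (0, 9, 13, 14):
        # noise_std=0.0 isolates the effect of the adversarial votes
        agg = aggregate_under_attack(true_grad, 27, num_adv, 0.0, rng)
        wrong = int(np.sum(agg != np.sign(true_grad)))
        print(f"adversaries={num_adv:2d}  wrong coordinates={wrong}")
```

In this noise-free toy setting the vote is only overturned once adversaries outnumber honest workers. With gradient noise, honest workers can themselves vote for the wrong sign, so incorrect aggregation can occur with fewer adversaries; raising `noise_std` in the sketch is one way to explore that effect.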

Figures (2)

  • Figure 1: Influence of the batch size on the convergence of signSGD with majority vote in the presence of omniscient adversaries, for the toy example with 27 workers and varying numbers of adversaries.
  • Figure 2: Training loss and test accuracy for batch size 64 and 500 iterations show no convergence with more than 33% adversaries.

Theorems & Definitions (7)

  • Definition 2.1
  • Theorem 3.5
  • proof
  • Lemma 3.6
  • Theorem 3.7
  • proof
  • proof