SignSGD with Federated Defense: Harnessing Adversarial Attacks through Gradient Sign Decoding

Chanho Park, Namyoon Lee

TL;DR

This work addresses the communication bottleneck and adversarial vulnerability of distributed SGD by introducing SignSGD with Federated Defense (signSGD-FD). Unlike traditional signSGD-MV, which counts every worker's one-bit vote equally, signSGD-FD uses gradient sign decoding with learnable log-likelihood ratio (LLR) weights to perform weighted majority voting. The resulting convergence rate remains invariant to the number of adversarial workers as long as they are outnumbered by benign workers. The authors provide a unified coding-theoretical interpretation, derive decoding-error bounds under attacks, and validate the approach with experiments on MNIST, CIFAR-10, and CIFAR-100, showing improved convergence and reduced communication costs. The proposed Federated Defense dynamically estimates cross-over probabilities and exploits compromised gradients rather than discarding them, offering practical robustness against a range of attack scenarios and enhancing the scalability of distributed learning in adversarial environments.
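
The difference between plain and weighted majority voting can be illustrated with a minimal sketch. It assumes each worker behaves like a binary symmetric channel with a known cross-over probability `p` and uses the standard LLR weight `ln((1 - p) / p)`; the function names, the clipping constant `eps`, and the toy numbers are illustrative assumptions, not taken from the paper.

```python
import math

def majority_vote(signs):
    """Unweighted vote (signSGD-MV style): sign of the sum of one-bit votes."""
    return 1 if sum(signs) >= 0 else -1

def llr_weight(p, eps=1e-6):
    """LLR weight ln((1 - p) / p) for a worker whose sign is wrong with probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip so the logarithm stays finite (assumed safeguard)
    return math.log((1.0 - p) / p)

def weighted_vote(signs, crossover_probs):
    """Weighted vote: a worker with p > 1/2 gets a negative weight,
    so its (mostly flipped) signs still contribute useful information."""
    score = sum(llr_weight(p) * s for s, p in zip(signs, crossover_probs))
    return 1 if score >= 0 else -1

# Toy case: the true sign is +1. Three benign workers (one of them errs)
# and two attackers that flip their signs.
signs = [+1, +1, -1, -1, -1]
probs = [0.2, 0.2, 0.2, 0.9, 0.9]  # hypothetical cross-over probabilities

mv = majority_vote(signs)         # the plain vote is dragged to the wrong sign
fd = weighted_vote(signs, probs)  # negative weights turn the attackers' votes around
```

In this toy case the unweighted vote returns the wrong sign, while the weighted vote recovers the true one because the attackers' negative weights invert their flipped signs.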

Abstract

Distributed learning is an effective approach to accelerate model training using multiple workers. However, substantial communication delays emerge between workers and a parameter server due to massive costs associated with communicating gradients. SignSGD with majority voting (signSGD-MV) is a simple yet effective optimizer that reduces communication costs through one-bit quantization, yet the convergence rates considerably decrease as adversarial workers increase. In this paper, we show that the convergence rate is invariant as the number of adversarial workers increases, provided that the number of adversarial workers is smaller than that of benign workers. The key idea showing this counter-intuitive result is our novel signSGD with federated defense (signSGD-FD). Unlike the traditional approaches, signSGD-FD exploits the gradient information sent by adversarial workers with the proper weights, which are obtained through gradient sign decoding. Experimental results demonstrate signSGD-FD achieves superior convergence rates over traditional algorithms in various adversarial attack scenarios.
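
One way to picture how the weights could be obtained from decoding results is to track, for each worker, how often its transmitted sign disagrees with the decoded sign. The sketch below assumes this simple empirical-frequency estimator; the helper names and toy data are hypothetical, and the paper's exact estimator may differ.

```python
def update_error_counts(error_counts, worker_signs, decoded_signs):
    """Add, per worker, the number of coordinates whose sign disagrees with the decoded sign."""
    for m, signs in enumerate(worker_signs):
        error_counts[m] += sum(1 for s, d in zip(signs, decoded_signs) if s != d)
    return error_counts

def estimate_crossover(error_counts, total_coords):
    """Empirical cross-over probability: disagreements divided by coordinates seen so far."""
    return [count / total_coords for count in error_counts]

# Toy data: two rounds, three workers, four coordinates per round.
# Each entry is (signs sent by the workers, signs decoded by the server).
rounds = [
    ([[1, -1, 1, 1], [1, -1, -1, 1], [-1, 1, -1, -1]], [1, -1, 1, 1]),
    ([[-1, -1, 1, 1], [-1, 1, 1, 1], [1, 1, -1, -1]], [-1, -1, 1, 1]),
]

counts = [0, 0, 0]
for worker_signs, decoded in rounds:
    update_error_counts(counts, worker_signs, decoded)

p_hat = estimate_crossover(counts, total_coords=8)
```

Here the third worker disagrees with every decoded sign, so its estimated cross-over probability is 1. A decoder that maps this estimate to a negative weight can use that worker's signs instead of discarding them.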

Paper Structure

This paper contains 29 sections, 6 theorems, 55 equations, 4 figures, 1 table, 1 algorithm.

Key Result

Theorem 4.3

Let $\hat{U}_n^t = A \left( \mathbf{Y}_n^t \right) \in \{-1,+1\}$ be the decoded gradient sign for the $n$th coordinate at iteration $t$. We define the maximum sign decoding error probability over all coordinates and iterations as […]. With a fixed learning parameter $\delta = \sqrt{\frac{2\left( f^1 - f^\star \right)}{T \lVert \mathbf{L} \rVert_1}}$, the convergence rate of signSGD-type algorithms is given by […]
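
As a small worked example of the fixed learning parameter $\delta$ in the theorem statement, the sketch below evaluates the formula directly. It assumes $\mathbf{L}$ is a vector of non-negative per-coordinate smoothness constants, as is common in signSGD analyses; the function name and the numeric values are hypothetical.

```python
import math

def fixed_learning_rate(f_init, f_star, num_iters, smoothness):
    """delta = sqrt(2 * (f^1 - f^*) / (T * ||L||_1))."""
    l1_norm = sum(abs(x) for x in smoothness)  # ||L||_1
    return math.sqrt(2.0 * (f_init - f_star) / (num_iters * l1_norm))

# Hypothetical values: f^1 = 2, f^* = 0, T = 100, L = (1, 1, 2), so ||L||_1 = 4.
delta = fixed_learning_rate(f_init=2.0, f_star=0.0, num_iters=100, smoothness=[1.0, 1.0, 2.0])
# delta = sqrt(2 * 2 / (100 * 4)) = sqrt(0.01) = 0.1
```

The step size shrinks as $1/\sqrt{T}$, so quadrupling the number of iterations halves $\delta$.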

Figures (4)

  • Figure 1: Test accuracy vs. training rounds varying the number of compromised workers $L$.
  • Figure 2: Test accuracy vs. training rounds varying sign-flipping probability $r$.
  • Figure 3: Test accuracy & communication costs comparison among the attack-robust distributed learning algorithms.
  • Figure 4: Test accuracy comparison according to the initial phase aggregation of signSGD-FD.
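
The sign-flipping probability $r$ varied in Figure 2 can be pictured with a minimal attack model. The sketch assumes a compromised worker flips each transmitted sign independently with probability `r`; the function name and this exact attack model are assumptions for illustration.

```python
import random

def flip_signs(signs, r, rng):
    """Flip each sign independently with probability r (assumed attack model)."""
    return [-s if rng.random() < r else s for s in signs]

rng = random.Random(0)  # seeded generator for reproducibility
clean = [1, -1, 1, 1, -1]

unchanged = flip_signs(clean, 0.0, rng)  # r = 0: the worker behaves like a benign one
inverted = flip_signs(clean, 1.0, rng)   # r = 1: every sign is flipped
```

Under this model, `r = 1` yields a deterministic flip, and intermediate values of `r` make the worker's reports progressively noisier.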

Theorems & Definitions (12)

  • Theorem 4.3: Universal convergence rate
  • Theorem 4.4: Decoding error bound of signSGD-FD
  • Theorem 4.5: Decoding error bound of signSGD-MV
  • Theorem 4.6: Decoding error bound for signSGD-FD under the SSFA
  • Corollary 4.7: Special case
  • Theorem 4.8: Decoding error bound of signSGD-MV