Byzantine Failures Harm the Generalization of Robust Distributed Learning Algorithms More Than Data Poisoning

Thomas Boudou, Batiste Le Bars, Nirupam Gupta, Aurélien Bellet

TL;DR

This work investigates how two common threat models in robust distributed learning, Byzantine failures and data poisoning, affect generalization. Using uniform algorithmic stability, it derives a fundamental gap, with $f$ out of $n$ workers misbehaving: under data poisoning, stability degrades by a tight $\Theta\left(\frac{f}{n-f}\right)$, whereas under Byzantine failures the degradation is lower-bounded by the larger $\Omega\left(\sqrt{\frac{f}{n-2f}}\right)$ in the regime $f\ge n/3$, which implies worse generalization guarantees under Byzantine attacks. The authors provide a unified stability framework, establish tight upper and lower bounds for both threat models under robust aggregation (notably SMEA), and relate stability to generalization to prove the separation. They complement the theory with empirical validation and discuss practical implications, including the potential for cryptographic safeguards and future work on aggregation rules that preserve co-coercivity. Overall, the paper clarifies why Byzantine attacks can harm performance on unseen data more than data poisoning does, and it motivates stability-aware design of robust aggregators.

Abstract

Robust distributed learning algorithms aim to maintain reliable performance despite the presence of misbehaving workers. Such misbehaviors are commonly modeled as Byzantine failures, allowing arbitrarily corrupted communication, or as data poisoning, a weaker form of corruption restricted to local training data. While prior work shows similar optimization guarantees for both models, an important question remains: How do these threat models impact generalization? Empirical evidence suggests a gap, yet it remains unclear whether it is unavoidable or merely an artifact of suboptimal attacks. We show, for the first time, a fundamental gap in generalization guarantees between the two threat models: Byzantine failures yield strictly worse rates than those achievable under data poisoning. Our findings leverage a tight algorithmic stability analysis of robust distributed learning. Specifically, we prove that: (i) under data poisoning, the uniform algorithmic stability of an algorithm with optimal optimization guarantees degrades by an additive factor of $\Theta\left(\frac{f}{n-f}\right)$, with $f$ out of $n$ workers misbehaving; whereas (ii) under Byzantine failures, the degradation is in $\Omega\left(\sqrt{\frac{f}{n-2f}}\right)$.
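
To get a feel for the size of the gap between the two rates stated above, the following minimal sketch evaluates $\frac{f}{n-f}$ and $\sqrt{\frac{f}{n-2f}}$ for a few values of $f$. It is only an illustration: the function names are hypothetical, and the hidden constants of the $\Theta(\cdot)$ and $\Omega(\cdot)$ notation are assumed to be 1, which the paper does not claim.

```python
import math


def poisoning_rate(n: int, f: int) -> float:
    """Order of the stability degradation under data poisoning: f / (n - f).

    Assumption: the constant hidden in Theta(.) is taken to be 1.
    """
    if not 0 <= f < n:
        raise ValueError("need 0 <= f < n")
    return f / (n - f)


def byzantine_rate(n: int, f: int) -> float:
    """Order of the lower bound under Byzantine failures: sqrt(f / (n - 2f)).

    Assumption: the constant hidden in Omega(.) is taken to be 1.
    """
    if not 0 <= 2 * f < n:
        raise ValueError("need 0 <= f < n / 2")
    return math.sqrt(f / (n - 2 * f))


if __name__ == "__main__":
    n = 100  # illustrative number of workers
    for f in (1, 5, 10, 20, 30, 40):
        p = poisoning_rate(n, f)
        b = byzantine_rate(n, f)
        print(f"f={f:3d}  poisoning={p:.3f}  byzantine={b:.3f}  ratio={b / p:.2f}")
```

With unit constants, the Byzantine expression exceeds the poisoning expression for every $0 < f < n/2$, and the ratio between them is largest when $f$ is small relative to $n$.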

Paper Structure

This paper contains 53 sections, 16 theorems, 109 equations, 1 figure, 1 table, 1 algorithm.

Key Result

Proposition 2.1

If $\mathcal{A}$ is $\varepsilon$-uniformly stable, then
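
The inequality that completes this statement is missing from the extract. For orientation only, the classical in-expectation bound for an $\varepsilon$-uniformly stable algorithm is sketched below. The symbols $S$ (training set), $R$ (population risk) and $\widehat{R}_S$ (empirical risk) are assumptions, and the exact form of the paper's Proposition 2.1 may differ.

$$
\left| \mathbb{E}_{S,\mathcal{A}}\!\left[ R(\mathcal{A}(S)) - \widehat{R}_S(\mathcal{A}(S)) \right] \right| \le \varepsilon
$$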

Figures (1)

  • Figure 1: Stability Under Optimal Poisoning And Tailored Byzantine Attacks.

Theorems & Definitions (39)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Proposition 2.1
  • Theorem 3.1
  • Theorem 3.2
  • Proof sketch
  • Theorem 3.3
  • Theorem 3.4
  • ...and 29 more