A Practical and Secure Byzantine Robust Aggregator

De Zhang Lee, Aashish Kolluri, Prateek Saxena, Ee-Chien Chang

TL;DR

This work tackles data-poisoning threats in ML by proposing RandEigen, a Byzantine robust aggregator that achieves a near-optimal, dimension-independent bias with quasi-linear runtime. It introduces two main innovations: a fast randomized method to approximate the dominant eigenvector and a convergence-based stopping rule that replaces fixed thresholds, combined with Johnson-Lindenstrauss dimensionality reduction to reduce computation. Theoretical results establish information-theoretic bias guarantees under a stability model, and extensive experiments across federated and centralized settings show RandEigen effectively mitigates a wide range of attacks, including the adaptive HiDRA attack, with substantial speedups and minimal impact on accuracy when no attack is present. These findings indicate RandEigen is practical for real-world neural network training and can serve as a robust defense against data poisoning in high-dimensional gradient spaces.
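Two of the ingredients named above, a randomized approximation of the dominant eigenvector and Johnson-Lindenstrauss projection, can be illustrated with a minimal NumPy sketch. This is not RandEigen itself. The function names (`jl_project`, `top_eigvec_power`), the Gaussian projection matrix, the projected dimension `k`, the synthetic data, and the tolerance `tol` are all assumptions made for illustration. The tolerance is only power iteration's own convergence check, not the paper's stopping rule for the filter.

```python
import numpy as np

def jl_project(X, k, rng):
    # Gaussian Johnson-Lindenstrauss map from d to k dimensions, scaled so
    # that squared norms are preserved in expectation (illustrative choice).
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R

def top_eigvec_power(Y, rng, iters=200, tol=1e-10):
    # Power iteration on the sample covariance of the rows of Y.
    # The covariance matrix is never formed: each step costs two
    # matrix-vector products with the centered data.
    Yc = Y - Y.mean(axis=0)
    v = rng.standard_normal(Yc.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = Yc.T @ (Yc @ v)
        w /= np.linalg.norm(w)
        converged = 1.0 - abs(w @ v) < tol
        v = w
        if converged:
            break
    return v

# Synthetic stand-in for gradients: 200 vectors in 2000 dimensions,
# 20 of which are shifted along a fixed direction u (hypothetical data).
rng = np.random.default_rng(0)
n, d, k = 200, 2000, 64
X = rng.standard_normal((n, d))
u = np.ones(d) / np.sqrt(d)
X[:20] += 100.0 * u

Y = jl_project(X, k, rng)        # work in k dimensions instead of d
v = top_eigvec_power(Y, rng)     # approximate dominant eigenvector of cov(Y)

# Compare against an exact eigendecomposition of the projected covariance.
exact = np.linalg.eigh(np.cov(Y, rowvar=False))[1][:, -1]
print("agreement with exact eigenvector:", abs(exact @ v))
```

The sketch avoids forming any covariance matrix, so each power-iteration step costs time proportional to the size of the projected data. This is the kind of saving that makes near-linear runtimes plausible, although the paper's actual construction may differ.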

Abstract

In machine learning security, one is often faced with the problem of removing outliers from a given set of high-dimensional vectors when computing their average. For example, many variants of data poisoning attacks produce gradient vectors during training that are outliers in the distribution of clean gradients, which bias the computed average used to derive the ML model. Filtering them out before averaging serves as a generic defense strategy. Byzantine robust aggregation is an algorithmic primitive which computes a robust average of vectors, in the presence of an $ε$ fraction of vectors which may have been arbitrarily and adaptively corrupted, such that the resulting bias in the final average is provably bounded. In this paper, we give the first robust aggregator that runs in quasi-linear time in the size of input vectors and provably has near-optimal bias bounds. Our algorithm also does not assume any knowledge of the distribution of clean vectors, nor does it require pre-computing any filtering thresholds from it. This makes it practical to use directly in standard neural network training procedures. We empirically confirm its expected runtime efficiency and its effectiveness in nullifying 10 different ML poisoning attacks.
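To make the bias problem concrete, the sketch below corrupts an $\epsilon$ fraction of synthetic vectors and compares the plain average with a crude one-shot spectral filter. The filter is a simplified stand-in, not the aggregator proposed in the paper. It uses an exact eigendecomposition and assumes $\epsilon$ is known. The function name `filtered_mean`, the shift size, and the data are assumptions made for illustration.

```python
import numpy as np

def filtered_mean(X, eps):
    # One-shot spectral filter (illustrative only): score each vector by its
    # squared projection on the top eigenvector of the sample covariance,
    # drop the ceil(eps * n) highest-scoring vectors, and average the rest.
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    _, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    v = vecs[:, -1]                    # dominant eigenvector
    scores = (Xc @ v) ** 2
    n_drop = int(np.ceil(eps * n))
    keep = np.argsort(scores)[: n - n_drop]
    return X[keep].mean(axis=0)

# Clean vectors have true mean 0; an eps fraction is shifted along u.
rng = np.random.default_rng(1)
n, d, eps = 500, 50, 0.1
X = rng.standard_normal((n, d))
u = np.zeros(d)
u[0] = 1.0
m = int(eps * n)
X[:m] += 20.0 * u                      # hypothetical corruption

bias_naive = np.linalg.norm(X.mean(axis=0))
bias_filtered = np.linalg.norm(filtered_mean(X, eps))
print("naive bias:", bias_naive, "filtered bias:", bias_filtered)
```

In this toy setting the corrupted vectors shift the plain average by roughly $\epsilon$ times the shift magnitude, while the filtered average stays close to the clean mean. An adaptive adversary would not be this easy to separate, which is why provable bias bounds matter.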

Paper Structure

This paper contains 43 sections, 22 theorems, 10 equations, 5 figures, 7 tables, and 5 algorithms.

Key Result

theorem 1

Let $S$ be a $(5\epsilon, \delta)$-stable set, where $\epsilon \leq 1/12$, $\delta = \sqrt{20} \sqrt{||\Sigma_S||_2}$, and $X$ is constructed by corrupting an $\epsilon$ proportion of $S$. Suppose the main loop in Algorithm alg:filter_modified terminates after $\tau \leq 2n\epsilon$ iterations with

Figures (5)

  • Figure 1: Illustrations of the gradients sampled at random steps of SGD under different data poisoning attacks, on image classification (HiDRA [DBLP:conf/sp/ChoudharyKS24], DBA [xie2019dba], TMA [fang2020local]) and language (EP [yang-etal-2021-careful]) models. PC 1 and PC 2 represent the projections (on a log scale) of the gradients onto the eigenvectors corresponding to the largest and second-largest eigenvalues, respectively.
  • Figure 2: The first subfigure illustrates that RandEigen admits a quasi-linear runtime with respect to $d$. The second subfigure illustrates the ratio between the time taken for FILTERING and RandEigen for various chunk sizes. For both plots, we include the theoretically expected behavior for comparison.
  • Figure 3: Model accuracy over training steps remains high when RandEigen is applied to models trained on the MNIST and F-MNIST datasets, under different choices of $\epsilon$.
  • Figure 4: Model accuracy over training steps remains high when the RandEigen defense is applied to a BERT language model trained on the SST dataset, under different choices of $\epsilon$.
  • Figure 5: Distribution of the poisoned and clean gradients projected against the dominant eigenvector (PC 1) for the Gradient Matching, Subpopulation Matching and DFBA attacks in subfigures (a), (b) and (c), respectively. The gradients shown in subfigures (a) and (b) were sampled from the first training step, whereas those in subfigure (c) were from a randomly sampled training step.
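
The PC 1 and PC 2 projections described in the captions of Figures 1 and 5 can be reproduced in spirit with a short NumPy sketch. It assumes the gradients are centered before projecting and leaves out the log scaling, since neither detail is stated here. The function name `project_top2` and the synthetic "gradients" are hypothetical.

```python
import numpy as np

def project_top2(G):
    # Center the vectors, then take the top two right singular vectors.
    # These are the eigenvectors of the sample covariance with the largest
    # and second-largest eigenvalues.
    Gc = G - G.mean(axis=0)
    _, _, Vt = np.linalg.svd(Gc, full_matrices=False)
    return Gc @ Vt[:2].T, Vt[:2]

# Synthetic stand-in: 300 vectors in 100 dimensions, 30 of them poisoned
# by a shift along a random unit direction (hypothetical data).
rng = np.random.default_rng(2)
n, d, n_poison = 300, 100, 30
G = rng.standard_normal((n, d))
direction = rng.standard_normal(d)
direction /= np.linalg.norm(direction)
G[:n_poison] += 15.0 * direction

scores, axes = project_top2(G)   # scores[:, 0] is PC 1, scores[:, 1] is PC 2
gap = abs(scores[:n_poison, 0].mean() - scores[n_poison:, 0].mean())
print("PC 1 gap between poisoned and clean means:", gap)
```

Plotting `scores[:, 0]` against `scores[:, 1]`, colored by whether a vector was poisoned, would give a scatter of the same kind as the figures describe. The sign of each principal component is arbitrary, which is why the gap is taken in absolute value.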

Theorems & Definitions (25)

  • definition 1: $(\epsilon, \delta)$-stability
  • theorem 1
  • theorem 2
  • theorem 2
  • Remark 1
  • theorem 3
  • corollary 1
  • theorem 4
  • theorem 5
  • lemma 1: Balancing Lemma
  • ...and 15 more
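
For orientation, the $(\epsilon, \delta)$-stability condition named in definition 1 is commonly stated in the robust statistics literature as follows. This is the textbook form for unit-scale data with reference mean $\mu$, given here as an assumption. The paper's own definition may be parameterized differently, for instance rescaled by $\|\Sigma_S\|_2$, as the choice of $\delta$ in theorem 1 suggests. A finite set $S \subset \mathbb{R}^d$ is $(\epsilon, \delta)$-stable with respect to $\mu$ if, for every subset $S' \subseteq S$ with $|S'| \geq (1 - \epsilon)|S|$,

$$
\left\| \frac{1}{|S'|} \sum_{x \in S'} (x - \mu) \right\|_2 \leq \delta
\qquad \text{and} \qquad
\left\| \frac{1}{|S'|} \sum_{x \in S'} (x - \mu)(x - \mu)^\top - I \right\|_2 \leq \frac{\delta^2}{\epsilon}.
$$

Informally, removing any $\epsilon$ fraction of a stable set moves its mean by at most $\delta$ and changes its second moment by at most $\delta^2/\epsilon$ in spectral norm.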