Data-Dependent Stability Analysis of Adversarial Training

Yihan Wang, Shuang Liu, Xiao-Shan Gao

TL;DR

The paper introduces data-dependent stability analysis for SGD-based adversarial training, deriving generalization bounds that incorporate the data distribution via on-average stability. For convex adversarial losses, the bound hinges on the adversarial population risk at initialization and gradient variance; for non-convex losses, an additional curvature term involving the Hessian appears under approximate Hessian Lipschitz assumptions. The bounds scale with the adversarial budget ${\epsilon}$ and remain consistent with standard training when ${\epsilon}=0$, while explicitly accounting for initialization and data-poisoning distribution shifts. Empirical results on multiple datasets validate the theory, showing how robust generalization degrades with larger budgets and how poisoning attacks influence robustness in a manner aligned with the proposed bounds.

Abstract

Stability analysis is an essential aspect of studying the generalization ability of deep learning, as it involves deriving generalization bounds for stochastic gradient descent-based training algorithms. Adversarial training is the most widely used defense against adversarial example attacks. However, previous generalization bounds for adversarial training have not included information regarding the data distribution. In this paper, we fill this gap by providing generalization bounds for stochastic gradient descent-based adversarial training that incorporate data distribution information. We utilize the concepts of on-average stability and high-order approximate Lipschitz conditions to examine how changes in data distribution and adversarial budget can affect robust generalization gaps. Our derived generalization bounds for both convex and non-convex losses are at least as good as the uniform stability-based counterparts which do not include data distribution information. Furthermore, our findings demonstrate how distribution shifts from data poisoning attacks can impact robust generalization.
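
To make the robust generalization gap concrete, here is a minimal sketch that is not the paper's experimental setup. It assumes a linear classifier with logistic loss, for which the worst-case $\ell_\infty$ perturbation of budget $\epsilon$ has the closed form $y\,w^\top x - \epsilon\|w\|_1$ for the margin. The synthetic data, the function names, and all hyperparameters are illustrative assumptions.

```python
import numpy as np


def adv_logistic_loss(w, X, y, eps):
    """Mean worst-case logistic loss of a linear model under l_inf perturbations.

    With labels y in {-1, +1}, the inner maximisation over ||delta||_inf <= eps
    reduces the margin y * w.x by eps * ||w||_1.
    """
    margins = y * (X @ w) - eps * np.abs(w).sum()
    return float(np.mean(np.logaddexp(0.0, -margins)))


def adv_grad(w, x, y, eps):
    """(Sub)gradient of the single-example adversarial logistic loss w.r.t. w."""
    margin = y * (x @ w) - eps * np.abs(w).sum()
    sigma = np.exp(-np.logaddexp(0.0, margin))  # equals 1 / (1 + exp(margin))
    return -sigma * (y * x - eps * np.sign(w))


def sgd_adversarial_training(X, y, eps, steps=2000, lr=0.05, seed=0):
    """Plain SGD on the adversarial loss, one uniformly sampled example per step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(y))
        w -= lr * adv_grad(w, X[i], y[i], eps)
    return w


def robust_losses(eps, n_train=50, n_test=5000, dim=20, seed=0):
    """Train on a small synthetic sample; return (train, test) adversarial loss."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)  # hypothetical ground-truth direction

    def sample(n):
        X = rng.normal(size=(n, dim))
        noisy_score = X @ w_true + 0.5 * rng.normal(size=n)
        return X, np.where(noisy_score >= 0, 1.0, -1.0)

    X_tr, y_tr = sample(n_train)
    X_te, y_te = sample(n_test)
    w = sgd_adversarial_training(X_tr, y_tr, eps, seed=seed)
    return adv_logistic_loss(w, X_tr, y_tr, eps), adv_logistic_loss(w, X_te, y_te, eps)


if __name__ == "__main__":
    for eps in (0.0, 0.05, 0.1):
        train, test = robust_losses(eps)
        print(f"eps={eps:.2f}  train={train:.3f}  test={test:.3f}  gap={test - train:.3f}")
```

The difference `test - train` is an empirical stand-in for the robust generalization gap on one sample. Setting `eps=0.0` recovers standard training, mirroring the consistency property stated in the TL;DR.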

Paper Structure

This paper contains 28 sections, 15 theorems, 96 equations, and 4 figures.

Key Result

Theorem 2

If ${\mathcal{A}}$ is $\varepsilon$-on-average stable, then the robust generalization gap of ${\mathcal{A}}$ is bounded by $\varepsilon$.
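
As a sketch of how a statement of this kind is usually written, the inequality below uses notation that is assumed here and may differ from the paper's: $S$ is the training sample, ${\mathcal{A}}_S$ the output of ${\mathcal{A}}$ on $S$, $R^{\mathrm{adv}}$ the adversarial population risk, and $\widehat{R}^{\mathrm{adv}}_S$ the adversarial empirical risk on $S$. The expectation is over the draw of $S$ and the internal randomness of ${\mathcal{A}}$.

$$
\mathbb{E}_{S}\,\mathbb{E}_{\mathcal{A}}\left[ R^{\mathrm{adv}}({\mathcal{A}}_S) - \widehat{R}^{\mathrm{adv}}_S({\mathcal{A}}_S) \right] \le \varepsilon
$$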

Figures (4)

  • Figure 1: The robust performance of adversarial training with the AT budget $\epsilon$ ranging from $0$ to $8/255$.
  • Figure 2: The robust overfitting phenomenon.
  • Figure 3: The robust generalization and robust test accuracy on poisoned CIFAR-10 under different stability attacks. The adversarial training budget $\epsilon=4/255$ and the poisoning budget $\epsilon'=8/255$.
  • Figure 4: The robust generalization and robust test accuracy on the poisoned data under HYP attack with different poisoning budgets. The adversarial training budget $\epsilon=4/255$ and the poisoning budget $\epsilon'$ varies.

Theorems & Definitions (31)

  • Definition 1: On-Average Stability
  • Theorem 2: kuzborskij2018data
  • Remark 6
  • Definition 7
  • Lemma 8
  • Theorem 9
  • Remark 10
  • Corollary 11
  • Theorem 12
  • Theorem 13: Multiple-pass Case
  • ...and 21 more