Adversarial Training in Low-Label Regimes with Margin-Based Interpolation

Tian Ye, Rajgopal Kannan, Viktor Prasanna

TL;DR

This paper introduces a novel semi-supervised adversarial training approach that improves both robustness and natural accuracy by generating interpolated adversarial examples that cross decision boundaries by a controlled margin. It also proposes a global epsilon scheduling strategy that progressively adjusts the upper bound of perturbation strengths during training.

Abstract

Adversarial training has emerged as an effective approach to train robust neural network models that are resistant to adversarial attacks, even in low-label regimes where labeled data is scarce. In this paper, we introduce a novel semi-supervised adversarial training approach that enhances both robustness and natural accuracy by generating effective adversarial examples. Our method begins by applying linear interpolation between clean and adversarial examples to create interpolated adversarial examples that cross decision boundaries by a controlled margin. This sample-aware strategy tailors adversarial examples to the characteristics of each data point, enabling the model to learn from the most informative perturbations. Additionally, we propose a global epsilon scheduling strategy that progressively adjusts the upper bound of perturbation strengths during training. The combination of these strategies allows the model to develop increasingly complex decision boundaries with better robustness and natural accuracy. Empirical evaluations show that our approach effectively enhances performance against various adversarial attacks, such as PGD and AutoAttack.
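
The interpolation step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the margin definition (labelled-class logit minus the largest other logit), the bisection search, the fallback behaviour, and the names `margin`, `interpolate`, `find_alpha`, and `target_margin` are all assumptions made for illustration, and the toy linear classifier stands in for a neural network. The sketch also assumes the margin decreases monotonically along the segment from the clean example to the adversarial one.

```python
import numpy as np

def margin(logits, label):
    """Assumed margin: labelled-class logit minus the largest other logit.
    Positive means correctly classified, negative means the boundary is crossed."""
    others = np.delete(logits, label)
    return logits[label] - others.max()

def interpolate(x_clean, x_adv, alpha):
    """Linear interpolation between a clean and an adversarial example."""
    return (1.0 - alpha) * x_clean + alpha * x_adv

def find_alpha(model, x_clean, x_adv, label, target_margin, iters=30):
    """Bisection for the smallest alpha in [0, 1] whose interpolated example
    crosses the decision boundary by at least target_margin.
    Hypothetical helper; assumes the margin is monotone in alpha."""
    def d(alpha):
        return margin(model(interpolate(x_clean, x_adv, alpha)), label)

    if d(1.0) > -target_margin:
        return 1.0  # assumed fallback: use the full adversarial example
    if d(0.0) <= -target_margin:
        return 0.0  # the clean example already satisfies the condition
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if d(mid) <= -target_margin:
            hi = mid  # mid crosses far enough; try a smaller alpha
        else:
            lo = mid  # mid does not cross far enough; move toward x_adv
    return hi

# Toy two-class linear classifier (an assumption, in place of a real network).
W = np.eye(2)
model = lambda x: W @ x

x_clean = np.array([2.0, 0.0])  # margin +2 for label 0
x_adv = np.array([0.0, 2.0])    # margin -2 for label 0
alpha = find_alpha(model, x_clean, x_adv, label=0, target_margin=0.5)
x_interp = interpolate(x_clean, x_adv, alpha)
```

In this toy case the margin along the segment is `2 - 4 * alpha`, so the search settles at `alpha = 0.625`, where the interpolated point sits 0.5 past the boundary. Because the search runs separately for each example, the resulting perturbation depends on that example, which is one way to read the "sample-aware" behaviour the abstract describes.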

Paper Structure

This paper contains 28 sections, 21 equations, 5 figures, 8 tables, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of updating the decision boundary of a classifier with interpolated adversarial examples.
  • Figure 2: Empirical support for Assumption 1 in \ref{sec:intp} and Assumption 2 in \ref{sec:insights}, based on three datasets (CIFAR-10, SVHN, CIFAR-100) at epochs 40 and 70. (a) The value of the margin function $d(\alpha; x_i,x_i^\text{pgd})$ for 20 randomly sampled training data points as $\alpha$ varies from 0 to 1. (b) Histogram of the loss ratio $\ell_\text{ce}(f_\theta(x_i^\text{adv}(\hat{\alpha})),\tilde{y}_i)/\ell_\text{ce}(f_\theta(\widehat{x_i}^\text{pgd}),\tilde{y}_i)$ as defined in Assumption 2 in \ref{sec:insights}.
  • Figure 3: Global epsilon scheduling strategies.
  • Figure 4: Performance of SSAT-MBI with varying $\beta$ on CIFAR-10 using WideResNet-28-5. The performance is measured by natural accuracy and robustness against PGD-20 and AutoAttack.
  • Figure 5: Performance of SSAT-MBI with varying $\rho$ on CIFAR-10 using WideResNet-28-5. The performance is measured by natural accuracy and robustness against PGD-20 and AutoAttack. "Curious" refers to global epsilon scheduling Curious-$(1.25, 70)$. "Const" means no global epsilon scheduling. "beta" refers to the hyperparameter $\beta$ introduced in \ref{sec:half}.