Asymptotic Behavior of Adversarial Training Estimator under $\ell_\infty$-Perturbation

Yiling Xie, Xiaoming Huo

TL;DR

This work analyzes adversarial training under $\ell_\infty$ perturbations within generalized linear models, revealing that the asymptotic distribution of the estimator can place mass at zero when $\beta^*=0$ for the critical perturbation order $\delta_n=\eta/\sqrt{n}$. By decomposing the regularization effect of the inner max, it shows a gradient-based regularization coupled with an $\ell_1$ penalty, explaining the observed sparsity phenomena. The authors propose adaptive adversarial training, a two-step procedure that uses the ERM estimator to weight perturbations, achieving asymptotic variable-selection consistency and, for $1/2<\gamma<1$, asymptotic unbiasedness. Through rigorous theory and extensive simulations and real-data experiments, the paper demonstrates superior sparsity-recovery and estimation accuracy for adaptive adversarial training compared with classic adversarial training, with practical implications for robust and parsimonious modeling. The findings bridge distributionally robust optimization, LASSO-type regularization, and adversarial robustness, offering a principled approach to sparse, robust inference under worst-case perturbations.

Abstract

Adversarial training has been proposed to protect machine learning models against adversarial attacks. This paper focuses on adversarial training under $\ell_\infty$-perturbation, which has recently attracted much research attention. The asymptotic behavior of the adversarial training estimator is investigated in the generalized linear model. The results imply that the asymptotic distribution of the adversarial training estimator under $\ell_\infty$-perturbation could put a positive probability mass at $0$ when the true parameter is $0$, providing a theoretical guarantee of the associated sparsity-recovery ability. Alternatively, a two-step procedure is proposed -- adaptive adversarial training, which could further improve the performance of adversarial training under $\ell_\infty$-perturbation. Specifically, the proposed procedure could achieve asymptotic variable-selection consistency and unbiasedness. Numerical experiments are conducted to show the sparsity-recovery ability of adversarial training under $\ell_\infty$-perturbation and to compare the empirical performance between classic adversarial training and adaptive adversarial training.
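The link between $\ell_\infty$-perturbation and $\ell_1$-type shrinkage can be seen in the simplest special case. The sketch below is an illustration, not the paper's code: it assumes linear regression with squared loss and perturbations applied to the features only. In that setting the inner maximum has the closed form $(|y-\bm{x}^\top\bm{\beta}|+\delta\Vert\bm{\beta}\Vert_1)^2$, which the code checks against a brute-force search over the vertices of the $\ell_\infty$ ball. All function and variable names are hypothetical.

```python
import itertools
import numpy as np

def adversarial_squared_loss(x, y, beta, delta):
    """Closed-form worst case of (y - (x + d)^T beta)^2 over ||d||_inf <= delta."""
    return (abs(y - x @ beta) + delta * np.abs(beta).sum()) ** 2

def brute_force_worst_case(x, y, beta, delta):
    """Maximise over the vertices of the l_inf ball.

    The loss is convex in the perturbation d, so its maximum over the
    box is attained at one of the 2^d vertices.
    """
    corners = itertools.product([-delta, delta], repeat=len(x))
    return max((y - (x + np.array(c)) @ beta) ** 2 for c in corners)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                    # one feature vector (illustrative)
beta = np.array([1.5, 0.0, -2.0, 0.3])    # illustrative coefficients
y = 0.7
delta = 0.1

closed = adversarial_squared_loss(x, y, beta, delta)
brute = brute_force_worst_case(x, y, beta, delta)
print(closed, brute)
```

Expanding the closed form gives the nominal squared residual plus the terms $2\delta\,|y-\bm{x}^\top\bm{\beta}|\,\Vert\bm{\beta}\Vert_1$ and $\delta^2\Vert\bm{\beta}\Vert_1^2$, so in this special case the perturbation budget acts as an $\ell_1$-type penalty on the coefficients.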

Paper Structure (35 sections, 19 theorems, 154 equations, 4 figures, 4 tables)

Key Result

Proposition 2.1

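If the function $h:\mathbb{R}^d\to\mathbb{R}$ is differentiable, $\nabla h$ is uniformly continuous, and $\mathbb{E}_{\bm{Z}\sim P}\left[ \Vert\nabla h(\bm{Z})\Vert_\ast\right]<\infty$, then we have that $\mathbb{E}_{\bm{Z}\sim P}\left[\sup_{\Vert\bm{\Delta}\Vert\leq\delta} h(\bm{Z}+\bm{\Delta})\right]=\mathbb{E}_{\bm{Z}\sim P}\left[h(\bm{Z})\right]+\delta\,\mathbb{E}_{\bm{Z}\sim P}\left[\Vert\nabla h(\bm{Z})\Vert_\ast\right]+o(\delta)$ as $\delta\to 0$, where $\Vert\cdot\Vert_\ast$ denotes the dual norm of $\Vert\cdot\Vert$.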
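A small numerical check can make the first-order behaviour behind this regularization effect concrete. The sketch below is an illustration under stated assumptions, not the paper's experiment: it takes $h(\bm{z})=\log(1+e^{\bm{w}^\top\bm{z}})$, uses the $\ell_\infty$ norm for the perturbation (so the dual norm is $\ell_1$), and lets an empirical Gaussian sample stand in for $P$. Because $h$ is increasing in $\bm{w}^\top\bm{z}$, the worst case over $\Vert\bm{\Delta}\Vert_\infty\le\delta$ is available in closed form. The vector `w`, the sample size, and all names are hypothetical.

```python
import numpy as np

def softplus(t):
    # log(1 + exp(t)), computed stably
    return np.logaddexp(0.0, t)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(1)
w = np.array([1.0, -2.0, 0.5])      # illustrative direction
Z = rng.normal(size=(5000, 3))      # empirical sample standing in for P
t = Z @ w

# Nominal risk E[h(Z)] and the gradient term E[||grad h(Z)||_1],
# using grad h(z) = sigmoid(w^T z) * w.
nominal = softplus(t).mean()
grad_dual_norm = (sigmoid(t) * np.abs(w).sum()).mean()

def gap_ratio(delta):
    """(worst-case risk - nominal risk) / delta for an l_inf ball of radius delta."""
    # sup over ||D||_inf <= delta of w^T (z + D) equals w^T z + delta * ||w||_1
    worst = softplus(t + delta * np.abs(w).sum()).mean()
    return (worst - nominal) / delta

ratios = {d: gap_ratio(d) for d in (1e-1, 1e-2, 1e-3)}
for d, r in ratios.items():
    print(f"delta={d:g}  ratio={r:.6f}  target={grad_dual_norm:.6f}")
```

As $\delta$ shrinks, the printed ratio should approach the gradient term, which is the sense in which the inner maximization behaves like a gradient-norm penalty scaled by $\delta$.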

Figures (4)

  • Figure 1: Coefficient Path in the Linear Regression
  • Figure 2: Coefficient Path in the Logistic Regression
  • Figure 3: Coefficient Path in the Linear Regression
  • Figure 4: Coefficient Path in the Logistic Regression

Theorems & Definitions (41)

  • Proposition 2.1: Regularization Effect
  • Remark 2.2
  • Corollary 3.1
  • Proposition 3.3
  • Proposition 3.4
  • Theorem 3.5: Asymptotic Behavior
  • Remark 3.6
  • Remark 3.7
  • Proposition 3.8: Sparsity-recovery Ability
  • Proposition 4.1: Regularization Effect of Adaptive Technique
  • ...and 31 more