Asymptotic Behavior of Adversarial Training Estimator under $\ell_\infty$-Perturbation

Yiling Xie; Xiaoming Huo

TL;DR

This work analyzes adversarial training under $\ell_\infty$ perturbations within generalized linear models, revealing that the asymptotic distribution of the estimator can place mass at zero when $\beta^*=0$ for the critical perturbation order $\delta_n=\eta/\sqrt{n}$. By decomposing the regularization effect of the inner max, it shows a gradient-based regularization coupled with an $\ell_1$ penalty, explaining the observed sparsity phenomena. The authors propose adaptive adversarial training, a two-step procedure that uses the ERM estimator to weight perturbations, achieving asymptotic variable-selection consistency and, for $1/2<\gamma<1$, asymptotic unbiasedness. Through rigorous theory and extensive simulations and real-data experiments, the paper demonstrates superior sparsity-recovery and estimation accuracy for adaptive adversarial training compared with classic adversarial training, with practical implications for robust and parsimonious modeling. The findings bridge distributionally robust optimization, LASSO-type regularization, and adversarial robustness, offering a principled approach to sparse, robust inference under worst-case perturbations.
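The $\ell_1$ penalty mentioned above can be seen directly in a special case. The following is a minimal sketch, assuming squared loss in linear regression (the paper itself treats generalized linear models); the function names and test values are hypothetical and chosen only for illustration. In this case the inner maximization over $\Vert\Delta\Vert_\infty\le\delta$ has the closed form $(|y-x^\top\beta|+\delta\Vert\beta\Vert_1)^2$, which the code compares against a brute-force search over the corners of the $\ell_\infty$ ball.

```python
import itertools

import numpy as np


def adversarial_squared_loss(x, y, beta, delta):
    """Closed-form worst-case squared loss under an l_inf perturbation of x."""
    residual = y - x @ beta
    return (abs(residual) + delta * np.sum(np.abs(beta))) ** 2


def brute_force_adversarial_loss(x, y, beta, delta):
    """Search the corners of the l_inf ball of radius delta.

    The squared loss is convex in the perturbation, so its maximum over
    the ball is attained at one of the 2^d corners.
    """
    best = -np.inf
    for signs in itertools.product([-1.0, 1.0], repeat=len(x)):
        perturbed = x + delta * np.array(signs)
        best = max(best, (y - perturbed @ beta) ** 2)
    return best


# Illustrative values only (not taken from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=4)
beta = np.array([1.5, 0.0, -0.7, 0.2])
y = 0.3
delta = 0.1

closed = adversarial_squared_loss(x, y, beta, delta)
brute = brute_force_adversarial_loss(x, y, beta, delta)
print(closed, brute)
```

The two values should agree up to floating-point error. Expanding the square gives the ordinary squared residual plus terms involving $\delta\Vert\beta\Vert_1$, which is where the LASSO-type behavior comes from in this special case.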

Abstract

Adversarial training has been proposed to protect machine learning models against adversarial attacks. This paper focuses on adversarial training under $\ell_\infty$-perturbation, which has recently attracted much research attention. The asymptotic behavior of the adversarial training estimator is investigated in the generalized linear model. The results imply that the asymptotic distribution of the adversarial training estimator under $\ell_\infty$-perturbation could put a positive probability mass at $0$ when the true parameter is $0$, providing a theoretical guarantee of the associated sparsity-recovery ability. Alternatively, a two-step procedure is proposed -- adaptive adversarial training, which could further improve the performance of adversarial training under $\ell_\infty$-perturbation. Specifically, the proposed procedure could achieve asymptotic variable-selection consistency and unbiasedness. Numerical experiments are conducted to show the sparsity-recovery ability of adversarial training under $\ell_\infty$-perturbation and to compare the empirical performance between classic adversarial training and adaptive adversarial training.

Paper Structure (35 sections, 19 theorems, 154 equations, 4 figures, 4 tables)

Introduction
Related Work
Notations and Definitions
Organization of this Paper
Regularization Effect of Adversarial Training
Asymptotic Behavior in Generalized Linear Model
Sparsity-recovery Ability
Adaptive Adversarial Training
Statistical Properties of Adaptive Adversarial Training
Numerical Experiments
Tractable Reformulation
Synthetic-data Numerical Experiments
Experimental Setting
Experimental Results
Real-data Numerical Experiments
...and 20 more sections

Key Result

Proposition 2.1

If the function $h:\mathbb{R}^d\to\mathbb{R}$ is differentiable, $\nabla h$ is uniformly continuous, and $\mathbb{E}_{\bm{Z}\sim P}\left[ \Vert\nabla h(\bm{Z})\Vert_\ast\right]<\infty$, then we have that as $\delta\to 0$.
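The conclusion of Proposition 2.1 is a first-order expansion of the worst-case expected loss in the perturbation size $\delta$. A sketch of the form such an expansion takes under the stated assumptions is given here, assuming that $\Vert\cdot\Vert$ denotes the perturbation norm, $\Vert\cdot\Vert_\ast$ its dual norm, and $\bm{\Delta}$ the perturbation (the symbol $\bm{\Delta}$ is introduced for illustration and the exact statement should be checked against the paper):

$$
\mathbb{E}_{\bm{Z}\sim P}\left[\sup_{\Vert\bm{\Delta}\Vert\le\delta} h(\bm{Z}+\bm{\Delta})\right]
= \mathbb{E}_{\bm{Z}\sim P}\left[h(\bm{Z})\right]
+ \delta\,\mathbb{E}_{\bm{Z}\sim P}\left[\Vert\nabla h(\bm{Z})\Vert_\ast\right]
+ o(\delta).
$$

For an $\ell_\infty$-perturbation the dual norm is the $\ell_1$ norm, so the first-order term penalizes $\Vert\nabla h(\bm{Z})\Vert_1$ on average, which matches the gradient-based regularization described in the TL;DR.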

Figures (4)

Figure 1: Coefficient Path in the Linear Regression
Figure 2: Coefficient Path in the Logistic Regression
Figure 3: Coefficient Path in the Linear Regression
Figure 4: Coefficient Path in the Logistic Regression

Theorems & Definitions (41)

Proposition 2.1: Regularization Effect
Remark 2.2
Corollary 3.1
Proposition 3.3
Proposition 3.4
Theorem 3.5: Asymptotic Behavior
Remark 3.6
Remark 3.7
Proposition 3.8: Sparsity-recovery Ability
Proposition 4.1: Regularization Effect of Adaptive Technique
...and 31 more
