On robust overfitting: adversarial training induced distribution matters

Runzhi Tian; Yongyi Mao

TL;DR

Robust overfitting in adversarial training is linked to the generalization difficulty of perturbation-induced distributions $\tilde{\mathcal D}_t$ encountered along the PGD-AT trajectory. The authors propose induced distribution experiments (IDE) and prove a generalization bound that ties the generalization gap to the expected local dispersion $\mathbb{E}_{\mathcal D^*}\tilde{\gamma}_t(x,y)$ of the perturbation operator $\mathcal Q_{x,y,\theta_t}$. Empirically, $\tilde{\gamma}_t(x,y)$ grows during training and tracks IDE errors across CIFAR-10/100, MNIST, and Reduced ImageNet, with angular-dispersion analyses offering a mechanism via the changing decision boundary. This distribution-dynamics perspective reframes robust generalization as a dynamical phenomenon and suggests new directions for mitigating robust overfitting by controlling perturbation dispersion.

Abstract

Adversarial training may be regarded as standard training with a modified loss function, yet its generalization error appears much larger than that of standard training under the standard loss. This phenomenon, known as robust overfitting, has attracted significant research attention and remains largely a mystery. In this paper, we first show empirically that robust overfitting correlates with the increasing generalization difficulty of the perturbation-induced distributions along the trajectory of adversarial training (specifically PGD-based adversarial training). We then provide a novel upper bound on the generalization error with respect to the perturbation-induced distributions, in which a property of the perturbation operator, referred to as "local dispersion", plays an important role. Experimental results are presented to validate the usefulness of the bound, and various additional insights are provided.
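To make the idea of a perturbation-induced distribution concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It uses a toy linear logistic classifier in place of a deep network, and the names `pgd_perturb` and `induced_sample` are hypothetical. The function `pgd_perturb` stands in for the perturbation operator $\mathcal Q_{x,y,\theta_t}$ (random start, signed-gradient steps, projection onto an $L_\infty$ ball), and `induced_sample` maps clean pairs $(x, y)$ to pairs $(v, y)$ of the kind that would be drawn from $\tilde{\mathcal D}_t$. The radius, step size, and step count are illustrative values only.

```python
import math
import random


def sign(a):
    """Return -1, 0, or +1 according to the sign of a."""
    return (a > 0) - (a < 0)


def logistic_loss(theta, x, y):
    """Logistic loss log(1 + exp(-y * <theta, x>)) for a label y in {-1, +1}."""
    margin = y * sum(t * xi for t, xi in zip(theta, x))
    return math.log1p(math.exp(-margin))


def loss_grad_x(theta, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * sum(t * xi for t, xi in zip(theta, x))
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * t for t in theta]


def pgd_perturb(theta, x, y, eps=0.3, step=0.1, n_steps=5, rng=random):
    """Toy stand-in for Q_{x,y,theta}: L-infinity PGD with a random start.

    Assumption: eps, step, and n_steps are illustrative, not the paper's settings.
    """
    # Random start inside the eps-ball around x.
    v = [xi + rng.uniform(-eps, eps) for xi in x]
    for _ in range(n_steps):
        g = loss_grad_x(theta, v, y)
        # Signed-gradient ascent step on the loss.
        v = [vi + step * sign(gi) for vi, gi in zip(v, g)]
        # Project back onto the eps-ball around the clean input.
        v = [min(max(vi, xi - eps), xi + eps) for vi, xi in zip(v, x)]
    return v


def induced_sample(theta, dataset, **pgd_kwargs):
    """Map clean pairs (x, y) to perturbed pairs (v, y) for a fixed model theta."""
    return [(pgd_perturb(theta, x, y, **pgd_kwargs), y) for x, y in dataset]


# Hypothetical toy data and model checkpoint.
rng = random.Random(0)
theta = [1.0, -2.0]
clean = [([0.5, 0.5], 1), ([-1.0, 0.2], -1)]
induced = induced_sample(theta, clean, rng=rng)
```

In an induced distribution experiment as described above, a fresh model would then be trained on such perturbed pairs with the standard loss and evaluated on held-out perturbed pairs produced by the same checkpoint. That training loop is omitted here.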

Paper Structure (11 sections, 1 theorem, 22 equations, 11 figures, 3 tables)

Introduction
Other Related works
Adversarial training and induced distributions
Training on the induced distributions
Generalization properties of the induced distributions
Other observations and discussions
Conclusion
Detailed Experimental setup
Proof of the theorem
Local dispersion results across other datasets
Additional results for section 6

Key Result

Theorem 5.1

Let $f \in {\cal F}$ and suppose that $f$ satisfies the above conditions. Then for any $\tau>0$, with probability at least $1-\tau$ over the i.i.d. draws of sample $\{(v_i, y_i)\}_{i=1}^{m}$ from $\tilde{\mathcal{D}}_{t}$,

Figures (11)

Figure 1: PGD-AT and the corresponding IDE results on the CIFAR-10 dataset with the Wide ResNet34 (WRN-34) model (Zagoruyko and Komodakis, 2016). A significant increase in the IDE testing error is observed with the appearance of robust overfitting, suggesting a correlation between the generalization difficulty on $\tilde{\mathcal{D}}_t$ and robust overfitting.
Figure 2: PGD-AT and the corresponding IDE results across different datasets. The behaviour of the red curves matches that of the yellow curve: in (a)-(c), a substantial rise in IDE testing error coincides with the emergence of robust overfitting, while in (d) the absence of robust overfitting coincides with a consistently low IDE testing error. Together these demonstrate a compelling correlation between the two quantities.
Figure 3: Results of additional experiments conducted on CIFAR-10. We perform PGD-AT with various weight decay rates and conduct IDEs for each PGD-AT variant. The blue curves are reproduced from Figure 1 as a reference for comparison. The results further support the correlation between robust overfitting and the IDE testing error.
Figure 4: Local dispersion measured on the CIFAR-10 and CIFAR-100 testing sets. (a) and (b): histograms of $\tilde{\gamma}_{t}(x,y)$ at three distinct PGD-AT checkpoints. (c) and (d): the evolution of the expected local dispersion (ELD) with respect to $t$, together with the IDE testing error for each $\tilde{\mathcal{D}}_t$. The results show that $\tilde{\gamma}_{t}(x,y)$ increases during PGD-AT and that, correspondingly, models trained on $\tilde{\mathcal{D}}_t$ generalize worse. This suggests that the local property of $\mathcal{Q}_{x,y,\theta_t}$ characterized by $\tilde{\gamma}_{t}(x,y)$ plays a dominant role in the generalization behaviour on $\tilde{\mathcal{D}}_t$.
Figure 5: Experiments on the CIFAR-10 and CIFAR-100 testing sets. (a) and (b): histograms of $d_{t}(x,y)$. (c) and (d): histograms of $\Phi_t(x,y)$. (e) and (f): the evolution of $\mathbb{E}_{\mathcal{D}}d_{t}(x,y)$ and $\mathbb{E}_{\mathcal{D}}\Phi_t(x,y)$ along the PGD-AT trajectory. Combined with the results in Figure 4, this reveals an interesting phenomenon in PGD-AT: as training proceeds, the perturbed data generated from $x$ move closer to $x$ while also becoming more dispersed, potentially because the perturbation angles spread out.
...and 6 more figures
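
The captions above refer to three per-example quantities: a dispersion $\tilde{\gamma}_t(x,y)$, a distance $d_t(x,y)$, and an angular quantity $\Phi_t(x,y)$. Their exact definitions are not reproduced on this page, so the sketch below uses simple Monte Carlo proxies that are assumptions rather than the paper's definitions: the mean distance of perturbed points from the clean input, the mean pairwise distance among perturbed points, and the mean pairwise angle between perturbation vectors. All function names are hypothetical.

```python
import math
from itertools import combinations


def _diff(a, b):
    """Element-wise difference a - b."""
    return [ai - bi for ai, bi in zip(a, b)]


def _norm(a):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(ai * ai for ai in a))


def mean_distance_to_clean(x, perturbed):
    """Proxy (assumption) for d_t: average L2 distance from x to its perturbations."""
    return sum(_norm(_diff(v, x)) for v in perturbed) / len(perturbed)


def mean_pairwise_distance(perturbed):
    """Proxy (assumption) for dispersion: average L2 distance between perturbations.

    Requires at least two perturbed points.
    """
    pairs = list(combinations(perturbed, 2))
    return sum(_norm(_diff(a, b)) for a, b in pairs) / len(pairs)


def mean_pairwise_angle(x, perturbed):
    """Proxy (assumption) for angular spread: average angle, in radians,
    between perturbation vectors v - x. Pairs with a zero vector are skipped."""
    deltas = [_diff(v, x) for v in perturbed]
    angles = []
    for a, b in combinations(deltas, 2):
        na, nb = _norm(a), _norm(b)
        if na == 0.0 or nb == 0.0:
            continue
        cos = sum(ai * bi for ai, bi in zip(a, b)) / (na * nb)
        # Clamp to guard against rounding slightly outside [-1, 1].
        angles.append(math.acos(max(-1.0, min(1.0, cos))))
    return sum(angles) / len(angles) if angles else 0.0


# Two orthogonal unit perturbations of the origin.
x0 = [0.0, 0.0]
vs = [[1.0, 0.0], [0.0, 1.0]]
dist = mean_distance_to_clean(x0, vs)    # 1.0
spread = mean_pairwise_distance(vs)      # sqrt(2)
angle = mean_pairwise_angle(x0, vs)      # pi / 2
```

With proxies of this kind, perturbations can sit closer to $x$ (smaller mean distance) and still be more spread out relative to one another (larger mean angle), which is the qualitative pattern the Figure 5 caption describes.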

Theorems & Definitions (1)

Theorem 5.1
