Improving Model Generalization by On-manifold Adversarial Augmentation in the Frequency Domain

Chang Liu; Wenzhao Xiang; Yuan He; Hui Xue; Shibao Zheng; Hang Su

TL;DR

This work tackles the failure of deep models to generalize to out-of-distribution (OOD) data. It introduces AdvWavAug, an on-manifold adversarial augmentation in the frequency domain based on wavelet transforms, integrated with AdvProp to encourage robustness within the data manifold. The authors provide a theoretical upper bound linking OOD generalization to on-manifold adversarial robustness and demonstrate substantial empirical improvements on ImageNet and its distorted variants, achieving state-of-the-art results for several transformer architectures. The approach is efficient, avoids heavy manifold estimation via VAEs, and remains compatible with other augmentation strategies and with self-supervised pretraining such as MAE. The work thus offers a practical path to stronger OOD generalization through semantically meaningful adversarial perturbations.

Abstract

Deep neural networks (DNNs) may suffer from significantly degraded performance when the training and test data come from different underlying distributions. Despite the importance of model generalization to out-of-distribution (OOD) data, the accuracy of state-of-the-art (SOTA) models on OOD data can plummet. Recent work has demonstrated that regular or off-manifold adversarial examples, as a special case of data augmentation, can be used to improve OOD generalization. Inspired by this, we theoretically prove that on-manifold adversarial examples can better benefit OOD generalization. Nevertheless, it is nontrivial to generate on-manifold adversarial examples because the real manifold is generally complex. To address this issue, we propose a novel method of Augmenting data with Adversarial examples via a Wavelet module (AdvWavAug), an on-manifold adversarial data augmentation technique that is simple to implement. In particular, we project a benign image into a wavelet domain. With the assistance of the sparsity characteristic of the wavelet transformation, we can modify an image on the estimated data manifold. We conduct adversarial augmentation based on the AdvProp training framework. Extensive experiments on different models and different datasets, including ImageNet and its distorted versions, demonstrate that our method can improve model generalization, especially on OOD data. By integrating AdvWavAug into the training process, we have achieved SOTA results on some recent transformer-based models.
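The wavelet-domain idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-level Haar transform and a multiplicative map on the coefficients, and the names `haar2d`, `ihaar2d`, and `perturb_in_wavelet_domain` are hypothetical. The point it shows is that scaling coefficients leaves zero coefficients at zero, so the sparsity pattern of the image is preserved.

```python
def haar2d(img):
    """One-level 2D Haar transform of an even-sized image (list of rows).

    Returns four half-size sub-bands: the average band and three detail
    bands (column difference, row difference, diagonal difference).
    """
    avg, d_col, d_row, d_diag = [], [], [], []
    for i in range(0, len(img), 2):
        r_avg, r_col, r_row, r_diag = [], [], [], []
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r_avg.append((a + b + c + d) / 2)
            r_col.append((a - b + c - d) / 2)
            r_row.append((a + b - c - d) / 2)
            r_diag.append((a - b - c + d) / 2)
        avg.append(r_avg)
        d_col.append(r_col)
        d_row.append(r_row)
        d_diag.append(r_diag)
    return avg, d_col, d_row, d_diag


def ihaar2d(avg, d_col, d_row, d_diag):
    """Inverse of haar2d: rebuild the full-size image from the four bands."""
    h, w = len(avg), len(avg[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = avg[i][j], d_col[i][j], d_row[i][j], d_diag[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return img


def perturb_in_wavelet_domain(img, maps):
    """Scale each wavelet coefficient by the matching entry of `maps`.

    `maps` holds four arrays shaped like the sub-bands. The multiplicative
    form is an assumption of this sketch; a map of all ones is the identity.
    """
    bands = haar2d(img)
    scaled = [
        [[c * m for c, m in zip(band_row, map_row)]
         for band_row, map_row in zip(band, amap)]
        for band, amap in zip(bands, maps)
    ]
    return ihaar2d(*scaled)
```

In a full adversarial augmentation the map would be updated from the gradient of the training loss; here it is supplied by hand. For an image made of constant 2x2 blocks, all detail coefficients are zero, so any multiplicative map changes only the block intensities and never introduces noise-like fine detail.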

Paper Structure (36 sections, 2 theorems, 25 equations, 14 figures, 10 tables, 1 algorithm)

Introduction
Background
Transform-based Augmentations
Adversary-based Augmentations
On-manifold Adversarial Augmentation in the Frequency Domain
On-manifold Augmentation Improving OOD Generalization
Manifold Representation in the Frequency Domain
Boosting Model Generalization via AdvWavAug
Problem Formulation
Implementation Details
Experiments
Experiments Setup
Datasets
Architectures
Augmentation Module
...and 21 more sections

Key Result

Lemma 1

For $\bm{\theta}$ and $\epsilon$, we have [inequality not recovered], where $\bm{x} \in \mathbb{R}^{d_0}$, $\bm{z}=g(\bm{x}) \in \mathbb{R}^{d}$, and $\epsilon_z=\sup_{\|\bm{\delta}_x\|_\infty\leq \epsilon}\|g(\bm{x})-g(\bm{x}+\bm{\delta}_x)\|_\infty$.

Figures (14)

Figure 1: Boosting model generalization with on-manifold adversarial examples in the frequency domain. The OOD data (denoted as a green triangle) resides on a manifold that is connected to, but distribution-shifted from, the original data manifold. An off-manifold adversarial example (denoted as a red point) moves outside the manifold, while an on-manifold one (denoted as a blue box) remains on the manifold. The on-manifold perturbations are more closely aligned with the semantic meaning in the wavelet domain. Extensive results, such as those on Swin Transformer Small (Liu et al., 2021), demonstrate that our AdvWavAug significantly improves model generalization on ImageNet (Russakovsky et al., 2015) and its distorted versions across various backbone models.
Figure 2: Illustration of on-manifold and off-manifold adversarial examples in the wavelet domain. Given the original image on the data manifold (denoted as a yellow star), a PGD attacker generates an off-manifold adversarial example (denoted as a red point) with a perturbation $\bm{\delta}_{pgd}$; our method generates an on-manifold adversarial example (denoted as a blue box) with a perturbation $\bm{\delta}_{f}$ parallel to the manifold. The wavelet decomposition of these two perturbations demonstrates that $\bm{\delta}_{f}$ is more related to the semantic meaning, while $\bm{\delta}_{pgd}$ is noise-like.
Figure 3: The overall pipeline of adversarial data augmentation with AdvWavAug, which yields improved model generalization. To obtain on-manifold adversarial examples $\bm{x}^{adv}$, we send the input image $\bm{x}$ into an adversarial augmentation module, in which $\bm{x}$ is projected into the wavelet domain by the wavelet transformation $\mathcal{W}$ and mapped back to the image domain by its inverse $\mathcal{W}^{-1}$. We design an attention map to receive the perturbations backpropagated from the loss. The wavelet decomposition of the perturbations ensures that AdvWavAug generates on-manifold adversarial examples. During model training, the original input $\bm{x}$ is augmented with the adversarial examples $\bm{x}^{adv}$, and the augmented data is used to train the target model.
Figure 4: Generalization comparison on ImageNet. AdvWavAug boosts model performance over the original AdvProp with a PGD attacker on ImageNet. The improvement is more significant for larger models. Our best result, 81.6% Top-1 accuracy on ImageNet, is obtained with Swin-Small (Swin-S).
Figure 5: Comparison of the computation cost of AdvWavAug and VQ-VAE augmentation, based on Swin Transformer Small.
...and 9 more figures
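
The pipeline in the Figure 3 caption can be summarized as a training objective. The block below is a hedged sketch rather than the paper's exact formulation: it assumes the attention map $\bm{A}$ acts multiplicatively on the wavelet coefficients, it borrows the clean-plus-adversarial loss form from AdvProp, and the symbols $\bm{A}$, $\mathcal{S}$, $\mathcal{D}$, and $L$ are introduced here for illustration.

```latex
% Assumption: A is the attention map, S its feasible set, D the training
% distribution, and L the classification loss. None of these symbols is
% taken from the source text.
\begin{align}
  \bm{x}^{adv}(\bm{A}) &= \mathcal{W}^{-1}\bigl(\bm{A} \odot \mathcal{W}(\bm{x})\bigr), \\
  \min_{\bm{\theta}} \; \mathbb{E}_{(\bm{x}, y) \sim \mathcal{D}}
    &\Bigl[ L(\bm{\theta}, \bm{x}, y)
      + \max_{\bm{A} \in \mathcal{S}} L\bigl(\bm{\theta}, \bm{x}^{adv}(\bm{A}), y\bigr) \Bigr].
\end{align}
```

Under these assumptions, setting every entry of $\bm{A}$ to one recovers the clean image, and the inner maximization corresponds to the step in which the attention map receives the perturbation backpropagated from the loss.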

Theorems & Definitions (5)

Lemma 1
proof
Theorem 1: OOD Upper-bound for Models Robust on On-manifold Adversarial Examples
Definition 1 (Wainwright, 2019)
proof