AROID: Improving Adversarial Robustness Through Online Instance-Wise Data Augmentation

Lin Li, Jianing Qiu, Michael Spratling

TL;DR

AROID introduces an automated online instance-wise data augmentation framework to bolster adversarial robustness in adversarial training by learning per-sample augmentation policies through a bi-level optimization. It relies on three guiding objectives—Vulnerability, Affinity, and Diversity—to shape augmentation hardness and policy exploration, with gradient estimation via REINFORCE to handle non-differentiable augmentation steps. Across CIFAR, Imagenette, and ImageNet, AROID achieves state-of-the-art robustness and accuracy, significantly reducing robust overfitting and offering substantial efficiency gains over previous automated DA methods. The method is versatile, enabling online or offline policy learning and compatibility with existing robust training techniques, thereby offering a practical tool for enhancing AT in diverse settings.
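To make the REINFORCE step concrete, here is a minimal sketch of a score-function gradient estimate for a single categorical distribution. It is an illustration only, not the authors' implementation: `reward_fn` is a hypothetical stand-in for whatever scalar signal scores a sampled augmentation (the paper's actual objective combines Vulnerability, Affinity and Diversity and is not reproduced here), and the function names are assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_grad(logits, reward_fn, n_samples, rng):
    """Monte Carlo estimate of d E[reward] / d logits.

    The reward may be non-differentiable (e.g. it depends on a discrete
    augmentation choice), so the gradient is routed through log p(a):
        grad = E[ reward(a) * d log p(a) / d logits ].
    For a softmax, d log p(a) / d logit_j = 1[j == a] - p_j.
    """
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(n_samples):
        a = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        r = reward_fn(a)  # hypothetical scalar score of the sampled choice
        for j in range(len(logits)):
            indicator = 1.0 if j == a else 0.0
            grad[j] += r * (indicator - probs[j])
    return [g / n_samples for g in grad]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy reward: only choice 2 is "good".
    grad = reinforce_grad([0.0, 0.0, 0.0], lambda a: 1.0 if a == 2 else 0.0, 20000, rng)
    print(grad)  # the logit of choice 2 receives a positive gradient
```

Ascending this gradient raises the probability of choices that score well. In practice a baseline is usually subtracted from the reward to reduce variance; that is omitted here for brevity.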

Abstract

Deep neural networks are vulnerable to adversarial examples. Adversarial training (AT) is an effective defense against adversarial examples. However, AT is prone to overfitting which degrades robustness substantially. Recently, data augmentation (DA) was shown to be effective in mitigating robust overfitting if appropriately designed and optimized for AT. This work proposes a new method to automatically learn online, instance-wise, DA policies to improve robust generalization for AT. This is the first automated DA method specific for robustness. A novel policy learning objective, consisting of Vulnerability, Affinity and Diversity, is proposed and shown to be sufficiently effective and efficient to be practical for automatic DA generation during AT. Importantly, our method dramatically reduces the cost of policy search from the 5000 hours of AutoAugment and the 412 hours of IDBH to 9 hours, making automated DA more practical to use for adversarial robustness. This allows our method to efficiently explore a large search space for a more effective DA policy and evolve the policy as training progresses. Empirically, our method is shown to outperform all competitive DA methods across various model architectures and datasets. Our DA policy reinforced vanilla AT to surpass several state-of-the-art AT methods regarding both accuracy and robustness. It can also be combined with those advanced AT methods to further boost robustness. Code and pre-trained models are available at https://github.com/TreeLLi/AROID.

Paper Structure

This paper contains 42 sections, 23 equations, 8 figures, 15 tables, 2 algorithms.

Figures (8)

  • Figure 1: An overview of the proposed method (legend in the right column). The top part shows the pipeline for training the policy model, $f_{plc}$, while the bottom illustrates the pipeline for training the target model, $f_{tgt}$. $f_{aft}$ is a model pre-trained on clean data without any augmentation, which is used to measure the distribution shift caused by data augmentation. Please refer to the method section for a detailed explanation.
  • Figure 2: An example of the proposed augmentation sampling procedure. The policy model takes an image as input and outputs logit values defining multiple, multinomial, probability distributions corresponding to different sub-policies. A sub-policy code is created by sampling from each of these distributions, and decoded into a sub-policy, i.e., a transformation and its magnitude. These transformations are applied, in sequence, to augment the image.
  • Figure 3: Ablation study of hyper-parameters $\lambda$, $\beta$, $l$, $u$, $T$ and $K$ for CIFAR10 with PRN18 (even rows) and Imagenette with ViT-B/16 (odd rows). The selected value for each hyper-parameter is marked in green.
  • Figure 4: The progression of the three proposed policy learning objectives throughout the AROID training process on CIFAR10 for WRN34-10. Lines are smoothed with a moving average over 5 epochs for improved clarity.
  • Figure 5: Visualization of the learned DA policies, applied to ten images randomly sampled from the CIFAR10 training set, for the Flip, Crop, Color/Shape and Dropout types of augmentations. The policy model is resumed from a checkpoint saved at the end of the $110^{\text{th}}$ epoch when training a WRN34-10 model on CIFAR10 (following the training setting specified in the experiment-setting appendix). The ten sampled images are shown at the bottom, in the same order as the x-axis of the bar charts above. The chance of applying no transformation (Identity) is the gap between the colored bar and the top (i.e., a score of 1.0). In the Color/Shape group, the probabilities of different magnitudes are not shown separately, but are summed to give the overall probability of a transformation.
  • ...and 3 more figures
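
The sampling procedure described in the Figure 2 caption can be sketched as follows. This is a hedged illustration, not the released code: the group names (Flip, Crop, Color/Shape, Dropout) and the Identity option follow the Figure 5 caption, but the specific transformations, magnitudes, and the names `SEARCH_SPACE`, `sample_sub_policies` and `decode` are assumptions made for the example. The logits would come from the policy model; here they are passed in directly.

```python
import math
import random

# Hypothetical search space: one list of (transformation, magnitude) options
# per sub-policy group. Index 0 is the Identity (no-op) option.
SEARCH_SPACE = {
    "flip": [("identity", None), ("hflip", None)],
    "crop": [("identity", None), ("crop_shift", 2), ("crop_shift", 4)],
    "color_shape": [("identity", None), ("brightness", 0.2),
                    ("brightness", 0.5), ("rotate", 15)],
    "dropout": [("identity", None), ("erase", 0.1), ("erase", 0.3)],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sub_policies(logits_per_group, rng):
    """Sample one code per group from its multinomial distribution.

    Returns the sampled codes and the summed log-probability of the sample,
    which a score-function (REINFORCE-style) estimator would need.
    """
    codes, log_prob = {}, 0.0
    for group, logits in logits_per_group.items():
        probs = softmax(logits)
        idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        codes[group] = idx
        log_prob += math.log(probs[idx])
    return codes, log_prob


def decode(codes):
    """Map each sampled code to its (transformation, magnitude) pair."""
    return [SEARCH_SPACE[group][idx] for group, idx in codes.items()]


if __name__ == "__main__":
    rng = random.Random(0)
    # Uniform logits stand in for the policy model's per-image output.
    logits = {g: [0.0] * len(opts) for g, opts in SEARCH_SPACE.items()}
    codes, log_prob = sample_sub_policies(logits, rng)
    print(decode(codes), log_prob)
```

The decoded pairs would then be applied to the image in sequence. Keeping the log-probability alongside the sample is a design choice that lets the same draw be reused when estimating the policy gradient.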