AROID: Improving Adversarial Robustness Through Online Instance-Wise Data Augmentation
Lin Li, Jianing Qiu, Michael Spratling
TL;DR
AROID is an automated, online, instance-wise data augmentation (DA) framework that improves adversarial robustness in adversarial training (AT) by learning per-sample augmentation policies through bi-level optimization. Three objectives (Vulnerability, Affinity, and Diversity) control augmentation hardness and policy exploration, and gradients are estimated with REINFORCE to handle non-differentiable augmentation steps. Across CIFAR, Imagenette, and ImageNet, AROID achieves state-of-the-art robustness and accuracy, significantly reduces robust overfitting, and is substantially more efficient than previous automated DA methods. The policy can be learned online or offline and is compatible with existing robust training techniques, making it a practical tool for improving AT in diverse settings.
Abstract
Deep neural networks are vulnerable to adversarial examples. Adversarial training (AT) is an effective defense against adversarial examples. However, AT is prone to overfitting, which degrades robustness substantially. Recently, data augmentation (DA) was shown to be effective in mitigating robust overfitting if appropriately designed and optimized for AT. This work proposes a new method to automatically learn online, instance-wise DA policies to improve robust generalization for AT. This is the first automated DA method specific to robustness. A novel policy learning objective, consisting of Vulnerability, Affinity and Diversity, is proposed and shown to be sufficiently effective and efficient to be practical for automatic DA generation during AT. Importantly, our method dramatically reduces the cost of policy search from the 5000 hours of AutoAugment and the 412 hours of IDBH to 9 hours, making automated DA more practical to use for adversarial robustness. This allows our method to efficiently explore a large search space for a more effective DA policy and to evolve the policy as training progresses. Empirically, our method is shown to outperform all competitive DA methods across various model architectures and datasets. With our DA policy, vanilla AT surpasses several state-of-the-art AT methods in both accuracy and robustness. Our DA policy can also be combined with those advanced AT methods to further boost robustness. Code and pre-trained models are available at https://github.com/TreeLLi/AROID.
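
The TL;DR notes that policy gradients are estimated with REINFORCE because sampling an augmentation is non-differentiable. The following is a minimal sketch of that estimator on a toy problem. It is not the authors' implementation. The three candidate operations, the `toy_reward` function, the moving-average baseline, and all hyperparameters are assumptions made for illustration, and the reward is a fixed stand-in rather than the Vulnerability, Affinity and Diversity objective.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(op_index):
    """Hypothetical reward per augmentation op (not AROID's actual objective)."""
    return [0.1, 0.9, 0.3][op_index]


def train_policy(num_ops=3, steps=2000, lr=0.1, seed=0):
    """Learn a categorical policy over discrete ops with REINFORCE."""
    rng = random.Random(seed)
    logits = [0.0] * num_ops
    baseline = 0.0  # moving average of past rewards, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        # Sampling a discrete op is the non-differentiable step.
        op = rng.choices(range(num_ops), weights=probs)[0]
        reward = toy_reward(op)
        advantage = reward - baseline
        # For a softmax policy: d log p(op) / d logit_k = 1[k == op] - p_k
        for k in range(num_ops):
            grad_log_prob = (1.0 if k == op else 0.0) - probs[k]
            logits[k] += lr * advantage * grad_log_prob
        baseline = 0.9 * baseline + 0.1 * reward
    return softmax(logits)


if __name__ == "__main__":
    print(train_policy())
```

In this sketch the policy is a single shared distribution. An instance-wise policy, as described in the abstract, would instead produce the logits from each input sample with a policy model, and the same log-probability gradient would be backpropagated into that model.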
