Revisiting Adversarial Training under Long-Tailed Distributions

Xinli Yue; Ningping Mou; Qian Wang; Lingchen Zhao

TL;DR

This work reevaluates adversarial training under long-tailed distributions, showing that AT-BSL can match the performance of the RoBal framework with much lower training overhead. It reveals that, akin to balanced data, long-tailed adversarial training suffers from robust overfitting, but data augmentation—particularly diverse augmentation strategies—can alleviate overfitting and significantly boost robustness, with gains attributed to increased example diversity. The authors validate that combining Balanced Softmax Loss with data augmentation yields substantial improvements (e.g., +6.66% AutoAttack robustness on CIFAR-10-LT with WRN-34-10) and generalize these findings across multiple architectures and datasets, offering practical guidance for real-world long-tailed robustness. Overall, the paper provides a principled, efficient path to robust adversarial performance in imbalanced settings by prioritizing BSL and diverse augmentation.

Abstract

Deep neural networks are vulnerable to adversarial attacks, often leading to erroneous outputs. Adversarial training has been recognized as one of the most effective methods to counter such attacks. However, existing adversarial training techniques have predominantly been tested on balanced datasets, whereas real-world data often exhibit a long-tailed distribution, casting doubt on the efficacy of these methods in practical scenarios. In this paper, we delve into adversarial training under long-tailed distributions. Through an analysis of the previous work "RoBal", we discover that utilizing Balanced Softmax Loss (BSL) alone can achieve performance comparable to the complete RoBal approach while significantly reducing training overhead. Additionally, we reveal that, similar to uniform distributions, adversarial training under long-tailed distributions also suffers from robust overfitting. To address this, we explore data augmentation as a solution and unexpectedly discover that, unlike results obtained with balanced data, data augmentation not only effectively alleviates robust overfitting but also significantly improves robustness. We further investigate the reasons behind the improvement of robustness through data augmentation and identify that it is attributable to the increased diversity of examples. Extensive experiments further corroborate that data augmentation alone can significantly improve robustness. Finally, building on these findings, we demonstrate that compared to RoBal, the combination of BSL and data augmentation leads to a +6.66% improvement in model robustness under AutoAttack on CIFAR-10-LT. Our code is available at https://github.com/NISPLab/AT-BSL.
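To make the Balanced Softmax Loss mentioned in the abstract concrete, the following is a minimal sketch of its commonly cited form: each logit is shifted by the log of its class's training-set count before the usual softmax cross-entropy. It is not taken from the authors' repository. The function names, the three-class counts, and the absence of any temperature or scaling factor on the log-count term are assumptions made for illustration. In adversarial training, the logits would be those the model produces on adversarial examples.

```python
import math


def softmax_cross_entropy(logits, label):
    """Plain softmax cross-entropy for a single example."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]


def balanced_softmax_loss(logits, label, class_counts):
    """Cross-entropy on logits shifted by log class counts (single example).

    Assumption: the unscaled form, logit_j + log(n_j), with n_j the number
    of training examples in class j.
    """
    adjusted = [z + math.log(n) for z, n in zip(logits, class_counts)]
    return softmax_cross_entropy(adjusted, label)


# Hypothetical long-tailed class counts: head, medium, tail.
counts = [5000, 500, 50]
# Identical logits, so any difference in loss comes from the count shift.
logits = [1.0, 1.0, 1.0]

head_loss = balanced_softmax_loss(logits, 0, counts)
tail_loss = balanced_softmax_loss(logits, 2, counts)
print(f"head-class loss: {head_loss:.4f}")
print(f"tail-class loss: {tail_loss:.4f}")
```

With identical logits, the tail-class loss is larger than the head-class loss, so the model has to produce a larger raw logit for a tail class to reach the same loss. When all class counts are equal, the shift cancels and the loss reduces to ordinary cross-entropy.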

Paper Structure (32 sections, 8 equations, 9 figures, 22 tables)

Introduction
Related Works
Analysis of RoBal
Preliminaries
Ablation Studies of RoBal
Robust Overfitting and Unexpected Discovery
Why Data Augmentation Can Improve Robustness
Experiments
Settings
Main Results
Further Analysis
Conclusion
Implementation Details of Experiments
Details of Table \ref{tab_robal_ablation}
Details of Data Augmentations
...and 17 more sections

Figures (9)

Figure 1: The clean accuracy and robustness under AutoAttack (AA) croce2020reliable of various adversarial training methods using WideResNet-34-10 zagoruyko2016wide on CIFAR-10-LT krizhevsky2009learning. Our method, building upon AT madry2018towards and BSL ren2020balanced, leverages data augmentation to improve robustness, achieving a +6.66% improvement over the SOTA method RoBal wu2021adversarial. REAT li2023adversarial is a concurrent work with ours, yet to be published.
Figure 2: Learning rate scheduling analysis of RoBal wu2021adversarial. (a) Comparison of the learning rate schedules: 'RoBal Code Schedule' from the source code and 'RoBal Paper Schedule' as described in the publication. (b) The evolution of test robustness under PGD-20 madry2018towards using ResNet-18 on CIFAR-10-LT across training epochs.
Figure 3: The evolution of test robustness under PGD-20 using ResNet-18 on CIFAR-10-LT for AT-BSL using different data augmentation strategies across training epochs. For reference, the red dashed lines in each panel represent the robustness of the best checkpoint of AT-BSL. Due to the density of the illustrations, the results have been compartmentalized into four distinct panels: (a), (b), (c), and (d).
Figure 4: The robustness under AA for AT-BSL with different augmentations using ResNet-18 on CIFAR-10-LT. (a) Change the augmentation space of RA cubuk2020randaugment to a single augmentation, and the horizontal axis represents the name of the single augmentation. (b) The horizontal axis represents the number of types of augmentations in the search space of RA.
Figure 5: The class-wise example number and robustness under AA for various algorithms on CIFAR-10-LT at the best checkpoint. (a) ResNet-18; (b) WideResNet-34-10.
...and 4 more figures
