Sustainable Self-evolution Adversarial Training
Wenxuan Wang, Chenglei Wang, Huihui Qi, Menghao Ye, Xuelin Qian, Peng Wang, Yanning Zhang
TL;DR
This work addresses the robustness of deep vision models against continually evolving adversarial attacks in long-term deployments. It introduces Sustainable Self-evolution Adversarial Training (SSEAT), a framework built on a Continual Adversarial Defense (CAD) pipeline that optimizes a continual min–max objective over successive attack sets: $\min_{\theta} \mathbb{E}_{(x,y)\sim D}\left[ \max_{\|\delta\|_p \le \epsilon} L_{adv}(\theta, x+\delta, y) \right]$. Two supporting modules are proposed to mitigate catastrophic forgetting and preserve clean accuracy: Adversarial Data Replay (ADR) selects diverse rehearsal samples via an uncertainty-based memory buffer, and the Consistency Regularization Strategy (CRS) aligns predictions across augmentations and training stages using a temperature-scaled Jensen–Shannon divergence, as formalized in the blended objective $L_{total}$. Experiments on CIFAR-10/100 show that SSEAT achieves superior robustness against unknown attacks while maintaining high accuracy on clean data, demonstrating practical potential for long-term secure deployment.
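The consistency term mentioned above can be illustrated with a minimal sketch of a temperature-scaled Jensen–Shannon divergence between several sets of logits (for example, predictions on different augmentations or from different training stages). This is not the authors' implementation: the function names, the default temperature of 2.0, and the use of the generalized (mean-KL-to-mixture) form are assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(distributions):
    """Generalized JS divergence: mean KL from each distribution to their mixture."""
    n = len(distributions)
    mixture = [sum(column) / n for column in zip(*distributions)]
    return sum(kl_divergence(p, mixture) for p in distributions) / n

def consistency_loss(logit_sets, temperature=2.0):
    """Hypothetical consistency term: JS divergence of temperature-softened predictions.

    The temperature default is an assumption, not a value taken from the paper.
    """
    probs = [softmax(logits, temperature) for logits in logit_sets]
    return js_divergence(probs)
```

Identical predictions give a loss of zero, and for two prediction sets the value is bounded above by $\ln 2$; a higher temperature softens both distributions and therefore shrinks the penalty for the same logits.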
Abstract
With the wide application of deep neural network models in various computer vision tasks, there has been a proliferation of adversarial example generation strategies aimed at probing model security. However, existing adversarial training defenses, which rely on a single or limited set of attack types under a one-time learning process, struggle to adapt to the dynamic and evolving nature of attack methods. Therefore, to improve the defense performance of models in long-term applications, we propose a novel Sustainable Self-Evolution Adversarial Training (SSEAT) framework. Specifically, we introduce a continual adversarial defense pipeline that learns from various kinds of adversarial examples across multiple stages. Additionally, to address the catastrophic forgetting caused by continually learning from novel attacks, we propose an adversarial data replay module that selects more diverse and critical data for relearning. Furthermore, we design a consistency regularization strategy that encourages the current defense model to learn from previously trained ones, guiding it to retain past knowledge and maintain accuracy on clean samples. Extensive experiments verify the efficacy of the proposed SSEAT defense method, which demonstrates superior defense performance and classification accuracy compared to competitors. Code is available at https://github.com/aup520/SSEAT
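As a rough illustration of how a replay module might choose relearning data, the sketch below keeps the samples on which the model is most uncertain. The summary only states that selection is uncertainty-based; using predictive entropy as the uncertainty score, and the names `predictive_entropy` and `select_replay_samples`, are assumptions for this example and may differ from the released code.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_replay_samples(sample_ids, prob_vectors, buffer_size):
    """Return the ids of the buffer_size samples with the highest entropy.

    Assumption: entropy stands in for the paper's uncertainty criterion.
    """
    scored = sorted(
        zip(sample_ids, prob_vectors),
        key=lambda pair: predictive_entropy(pair[1]),
        reverse=True,  # most uncertain first
    )
    return [sample_id for sample_id, _ in scored[:buffer_size]]

# Usage: three adversarial samples with predicted class probabilities.
ids = ["a", "b", "c"]
probs = [[0.98, 0.01, 0.01], [0.34, 0.33, 0.33], [0.70, 0.20, 0.10]]
replay_ids = select_replay_samples(ids, probs, buffer_size=2)
```

Here the near-uniform prediction for `b` and the moderately spread prediction for `c` are retained, while the confidently classified `a` is dropped from the buffer.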
