Sustainable Self-evolution Adversarial Training

Wenxuan Wang, Chenglei Wang, Huihui Qi, Menghao Ye, Xuelin Qian, Peng Wang, Yanning Zhang

TL;DR

This work addresses robustness of deep vision models under continually evolving adversarial attacks in long-term deployments. It introduces Sustainable Self-evolution Adversarial Training (SSEAT), a framework built on a Continual Adversarial Defense (CAD) pipeline that optimizes a continual min–max objective over successive attack sets: $\min_{\theta} \mathbb{E}_{(x,y)\sim D}\left[ \max_{\|\delta\|_p \le \epsilon} L_{adv}(\theta, x+\delta, y) \right]$. Two supporting modules are proposed to prevent forgetting and preserve clean accuracy: Adversarial Data Replay (ADR) selects diverse rehearsal samples via an uncertainty-based memory buffer, and Consistency Regularization Strategy (CRS) aligns predictions across augmentations and stages using Jensen–Shannon divergence with temperature scaling, as formalized in the blended objective $L_{total}$. Experiments on CIFAR-10/100 show SSEAT achieves superior robustness against unknown attacks while maintaining high accuracy on clean data, demonstrating practical potential for long-term secure deployment.
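To make the inner maximization of the min–max objective concrete, the sketch below uses a case where it has a closed form: a linear classifier with logistic loss under an $\ell_\infty$ budget. This is an illustrative assumption, not the SSEAT setup (which trains deep networks against generated attacks); the names `logistic_loss` and `worst_case_linf` are hypothetical.

```python
import math

def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def logistic_loss(w, b, x, y):
    """Logistic loss log(1 + exp(-margin)) for a label y in {-1, +1}."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return math.log1p(math.exp(-margin))

def worst_case_linf(w, x, y, eps):
    """Closed-form inner maximizer for a linear model with ||delta||_inf <= eps.

    The loss is decreasing in the margin, so the strongest perturbation moves
    every coordinate by eps against the sign of y * w_i. This lowers the
    margin by exactly eps * ||w||_1.
    """
    return [xi - eps * y * sign(wi) for wi, xi in zip(w, x)]

# Toy example (assumed values, chosen only for illustration).
w, b = [2.0, -1.0], 0.0
x, y, eps = [1.0, 1.0], 1, 0.25

x_adv = worst_case_linf(w, x, y, eps)
clean_loss = logistic_loss(w, b, x, y)
adv_loss = logistic_loss(w, b, x_adv, y)
print(x_adv, clean_loss, adv_loss)
```

Adversarial training would then minimize `adv_loss` rather than `clean_loss` with respect to `w` and `b`. For deep networks the inner maximizer has no closed form and is approximated by an iterative attack.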

Abstract

With the wide application of deep neural network models in various computer vision tasks, there has been a proliferation of adversarial example generation strategies aimed at deeply exploring model security. However, existing adversarial training defense models, which rely on a single or limited set of attack types under a one-time learning process, struggle to adapt to the dynamic and evolving nature of attack methods. Therefore, to improve the defense performance of models in long-term applications, we propose a novel Sustainable Self-Evolution Adversarial Training (SSEAT) framework. Specifically, we introduce a continual adversarial defense pipeline that learns from various kinds of adversarial examples across multiple stages. Additionally, to address the catastrophic forgetting caused by continual learning from ongoing novel attacks, we propose an adversarial data replay module that selects more diverse and representative data for relearning. Furthermore, we design a consistency regularization strategy to encourage current defense models to learn more from previously trained ones, guiding them to retain more past knowledge and maintain accuracy on clean samples. Extensive experiments have been conducted to verify the efficacy of the proposed SSEAT defense method, which demonstrates superior defense performance and classification accuracy compared to competitors. Code is available at https://github.com/aup520/SSEAT
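The replay idea can be illustrated with a small selection routine. The sketch below is a simplified stand-in, not the authors' ADR module: it assumes predictive entropy as the uncertainty score and picks evenly spaced samples from the entropy ranking so the buffer covers both confident and uncertain examples. The function names and the toy pool are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a probability vector; zero entries are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_replay(scored, k):
    """Pick k sample ids spread evenly over the uncertainty ranking.

    scored: list of (sample_id, predicted_probabilities) pairs.
    Assumption: spreading picks over the sorted entropies is used here as a
    simple proxy for "diverse" replay data.
    """
    ranked = sorted(scored, key=lambda item: entropy(item[1]))
    n = len(ranked)
    if k <= 0:
        return []
    if k >= n:
        return [sample_id for sample_id, _ in ranked]
    if k == 1:
        return [ranked[n // 2][0]]  # single pick: take the median-uncertainty sample
    # Integer arithmetic keeps indices distinct and includes both endpoints.
    indices = [(i * (n - 1)) // (k - 1) for i in range(k)]
    return [ranked[i][0] for i in indices]

# Toy pool of model predictions on adversarial examples (assumed values).
pool = [
    ("a", [0.98, 0.01, 0.01]),
    ("b", [0.60, 0.30, 0.10]),
    ("c", [0.34, 0.33, 0.33]),
    ("d", [0.90, 0.05, 0.05]),
    ("e", [0.50, 0.50, 0.00]),
]
buffer_ids = select_replay(pool, 3)
print(buffer_ids)
```

In a multi-stage pipeline, a routine like this would run at the end of each stage, and the selected samples would be mixed into the training data of later stages.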

Paper Structure

This paper contains 16 sections, 9 equations, 3 figures, 8 tables, 1 algorithm.

Figures (3)

  • Figure 1: A conceptual overview of our Sustainable Self-evolution Adversarial Training (SSEAT) method. When new adversarial examples are continually generated in complex, long-term multimedia applications, existing adversarial training methods struggle to adapt to iteratively updated attack methods. In contrast, our SSEAT model achieves sustainable defense performance improvements by continuously absorbing new adversarial knowledge.
  • Figure 2: (a) Illustration of our Continual Adversarial Defense (CAD) pipeline. CAD helps the model to learn from new kinds of attacks in multiple stages continuously. (b) Illustration of our Adversarial Data Replay (ADR) module. ADR guides the model to select diverse and representative replay data to alleviate the catastrophic forgetting issue. (c) Illustration of our Consistency Regularization Strategy (CRS) component. CRS encourages the model to learn more from the historically trained models to maintain classification accuracy.
  • Figure 3: Visualization of clean-example representations for CAD and SSEAT using t-SNE (Linderman et al., 2017). We use 1000 test images from the CIFAR-10 dataset for visualization. Different colors represent different categories.
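The consistency regularization described for Figure 2(c) relies on Jensen–Shannon divergence between temperature-scaled predictions. A minimal sketch of that quantity follows, using $\mathrm{JS}(p, q) = \tfrac{1}{2}\mathrm{KL}(p \,\|\, m) + \tfrac{1}{2}\mathrm{KL}(q \,\|\, m)$ with $m = \tfrac{1}{2}(p + q)$. How SSEAT weights this term inside $L_{total}$ and which prediction pairs it compares are not shown here; the function names, the temperature value, and the example logits are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q):
    """KL divergence KL(p || q) in nats; terms with p_i = 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by ln(2)."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def consistency_loss(logits_a, logits_b, temperature=2.0):
    """JS divergence between two temperature-scaled predictions."""
    return js_divergence(softmax(logits_a, temperature),
                         softmax(logits_b, temperature))

# Hypothetical logits from the current model and a previously trained model.
current = [2.0, 0.5, -1.0]
previous = [1.5, 1.0, -0.5]
loss = consistency_loss(current, previous)
print(loss)
```

Unlike plain KL divergence, the JS form is symmetric in its two arguments and stays finite, which makes it a convenient penalty when neither prediction is treated as the fixed target.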