Rapid Plug-in Defenders

Kai Wu; Yujian Betterest Li; Jian Lou; Xiaoyu Zhang; Handing Wang; Jing Liu

TL;DR

This paper proposes CeTaD (Considering Pre-trained Transformers as Defenders), a method for the Rapid Plug-in Defender (RaPiD) problem that is optimized for efficient computation. It adapts rapidly to various attacks and application scenarios without altering the target model or the clean training data.

Abstract

Deep neural networks are now deployed in everyday services, which makes their reliability critical. However, these networks are vulnerable to adversarial attacks, primarily evasion attacks, which threaten their functionality. Common methods for enhancing robustness rely on heavy adversarial training or on knowledge learned from clean data, and both require substantial computational resources. This time cost severely limits how quickly large foundational models can respond to adversarial perturbations. To address this challenge, this paper focuses on the Rapid Plug-in Defender (RaPiD) problem, which aims to counter adversarial perturbations rapidly without altering the deployed model. Drawing inspiration from the generalization and universal computation ability of pre-trained transformer models, we propose CeTaD (Considering Pre-trained Transformers as Defenders), a method for RaPiD optimized for efficient computation. CeTaD fine-tunes the normalization layer parameters within the defender using a limited set of clean and adversarial examples. Our evaluation assesses CeTaD's effectiveness, its transferability, and the impact of its components in scenarios with one-shot adversarial examples. The proposed method adapts rapidly to various attacks and application scenarios without altering the target model or the clean training data. We also explore how varying training data conditions influence CeTaD's performance. Notably, CeTaD is adaptable across differentiable service models and shows potential for continuous learning.
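The plug-in idea described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the defender here is a single layer normalization followed by a fixed linear map rather than a pre-trained transformer, the service model is a fixed two-class linear scorer, and gradients are estimated by finite differences instead of backpropagation. All names (`FROZEN_W`, `SERVICE_W`, `defend`, `train`) and all numeric values are hypothetical. What the sketch keeps from the paper is the structure: the defender's output is added to the input, the deployed model stays frozen, and only the normalization gain and bias are updated using one adversarial example.

```python
import math

def layer_norm(x, gain, bias, eps=1e-5):
    """Normalize a vector, then apply the per-feature gain and bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [g * (v - mean) / math.sqrt(var + eps) + b
            for v, g, b in zip(x, gain, bias)]

def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

DIM = 4
# Frozen defender weights: a toy stand-in for pre-trained weights (assumption).
FROZEN_W = [[0.1 * ((i + j) % 3 - 1) for j in range(DIM)] for i in range(DIM)]
# Frozen deployed service model: a fixed two-class linear scorer (assumption).
SERVICE_W = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]

def defend(x, params):
    """Add the defender's feature to the input (residual connection)."""
    gain, bias = params[:DIM], params[DIM:]
    feature = matvec(FROZEN_W, layer_norm(x, gain, bias))
    return [xi + fi for xi, fi in zip(x, feature)]

def loss(x, label, params):
    """Cross-entropy of the frozen service model on the defended input."""
    scores = matvec(SERVICE_W, defend(x, params))
    log_z = math.log(sum(math.exp(s) for s in scores))
    return log_z - scores[label]

def train(x, label, params, lr=0.5, steps=50, h=1e-5):
    """Gradient descent on the normalization parameters only.

    Central finite differences replace backpropagation to keep the
    sketch dependency-free; FROZEN_W and SERVICE_W are never modified.
    """
    params = list(params)
    for _ in range(steps):
        grad = []
        for k in range(len(params)):
            up = params[:k] + [params[k] + h] + params[k + 1:]
            down = params[:k] + [params[k] - h] + params[k + 1:]
            grad.append((loss(x, label, up) - loss(x, label, down)) / (2 * h))
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

# One hypothetical adversarial example with its true label (one-shot setting).
x_adv = [1.0, -0.5, 0.3, 0.8]
init_params = [1.0] * DIM + [0.0] * DIM   # gain = 1, bias = 0
tuned_params = train(x_adv, 0, init_params)
loss_before = loss(x_adv, 0, init_params)
loss_after = loss(x_adv, 0, tuned_params)
```

Comparing `loss_before` with `loss_after` shows the effect of tuning a handful of normalization parameters while every other weight stays fixed. In a realistic setting the toy linear map would be replaced by a pre-trained transformer and the finite-difference loop by automatic differentiation.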

Paper Structure (23 sections, 3 equations, 5 figures, 18 tables)

Introduction
Related Work
Pre-trained Transformers as Defenders
Experiments
Experimental Setup
CeTaD versus Baselines
Generalization of CeTaD on Different Attacks
Zero-shot Transfer to Different Datasets
Effect of Pre-trained Models on CeTaD
Discussion on Convergence and Overfitting
Role of Pre-trained Initialization and Frozen Parameters
Effect of Training Data on CeTaD
Black-Box Attack
Discussion: Limitations and Future Work
Conclusion
...and 8 more sections

Figures (5)

Figure 1: The structure of CeTaD. A stack of an embedding, a transformer encoder, and a decoder produces a feature that is added to the input example before the deployed service model processes it. The deployed model is frozen in RaPiD.
Figure 2: Accuracy and loss over epochs. Top: accuracy curves, where Training denotes accuracy on training data and Test denotes accuracy on test data. Note that clean training data remains unseen during training. Bottom: the loss curve on training data. Because accuracy stays at 100% after 90 epochs, the loss curve gives further insight into the training process.
Figure 3: Comparison between previous adversarial training methods and ours. (a) Previous methods rely on large numbers of adversarial examples to tune the original model, which demands significant time and computational resources. (b) In contrast, our approach tunes only a subset of the parameters within the plug-in defender block using limited adversarial examples, so it takes effect quickly without heavy computational demands.
Figure 4: Accuracy and loss vs. epoch with seed 41.
Figure 5: Accuracy and loss vs. epoch with seed 43.
