CLIP-Guided Generative Networks for Transferable Targeted Adversarial Attacks

Hao Fang; Jiawei Kong; Bin Chen; Tao Dai; Hao Wu; Shu-Tao Xia

TL;DR

A masked fine-tuning mechanism is proposed to further strengthen the CLIP-guided generative method when attacking a single class, allowing it to surpass existing single-target methods.

Abstract

Transferable targeted adversarial attacks aim to mislead models into outputting adversary-specified predictions in black-box scenarios. Recent studies have introduced single-target generative attacks that train a generator for each target class to generate highly transferable perturbations, resulting in substantial computational overhead when handling multiple classes. Multi-target attacks address this by training only one class-conditional generator for multiple classes. However, the generator simply uses class labels as conditions, failing to leverage the rich semantic information of the target class. To this end, we design a CLIP-guided Generative Network with Cross-attention modules (CGNC) to enhance multi-target attacks by incorporating textual knowledge of CLIP into the generator. Extensive experiments demonstrate that CGNC yields significant improvements over previous multi-target generative attacks, e.g., a 21.46% improvement in success rate from ResNet-152 to DenseNet-121. Moreover, we propose a masked fine-tuning mechanism to further strengthen our method in attacking a single class, which surpasses existing single-target methods.
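
To make the idea of conditioning a generator on text through cross-attention more concrete, the following is a minimal sketch of single-head scaled dot-product cross-attention in plain Python. It is not the CGNC implementation: the function names, the toy 2-dimensional vectors, and the absence of learned query/key/value projections are all assumptions made for illustration. Image-feature tokens act as queries and text-feature tokens act as keys and values, so each image token becomes a weighted mix of the text features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries: image-feature vectors, each of length d
    keys, values: text-feature vectors (keys have length d)
    Returns one attended vector per query.
    """
    d = len(keys[0])
    value_dim = len(values[0])
    attended = []
    for q in queries:
        # Similarity of this image token to every text token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the text value vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(value_dim)]
        )
    return attended


# Hypothetical toy inputs: 3 image tokens and 2 text tokens, all 2-dimensional.
image_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_attention(image_tokens, text_tokens, text_tokens)
```

In this toy setup, an image token that aligns with the first text token draws more of its output from that token, while a token equally similar to both text tokens mixes them evenly. A practical model would add learned projections, multiple heads, and residual connections.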

Paper Structure (24 sections, 3 equations, 10 figures, 12 tables, 1 algorithm)

Introduction
Related Work
Vision-Language Models
Adversarial Attacks
Method
Preliminary
CLIP-Guided Generative Network
Masked Fine-Tuning Mechanism
Experiments
Experimental Settings
Multi-Target Transferability Evaluation
Evaluation on More Scenarios
Ablation Study
Comparison with Single-Target Attacks
Conclusion
...and 9 more sections

Figures (10)

Figure 1: (a) Targeted attacks from 'Panda' to 'Dog'. The left figure illustrates that previous multi-target methods (han2019once; yang2022boosting) generate perturbations simply conditioned on class indices or one-hot vectors and only learn the classification boundary specific to the surrogate model. In contrast, our method exploits CLIP's meaningful guidance to effectively capture the feature distribution inherent to the target data, thereby essentially boosting the transferability. (b) We directly feed the scaled perturbations generated by both our CGNC and C-GSP (yang2022boosting) into three black-box classifiers. The results reveal that our generated perturbations achieve significantly higher mean confidence for the target class, demonstrating their superiority in modeling the target feature distribution.
Figure 2: An overview of the proposed CGNC architecture. Equipped with three dedicated modules (VL-Purifier, F-Encoder, and CA-Decoder), the generator leverages the textual representations encoded by CLIP as auxiliary information about the target classes to better probe their data distribution and achieve superior attack effects.
Figure 3: Illustration of the proposed masked fine-tuning mechanism. (a) We fix the condition input with different text prompts to fine-tune the trained conditional generator $G_{w}$ into multiple generators for single-target attacks. (b) The fooling rate of several target classes with Inc-v3 and Res-152 as substitute models respectively. The results indicate that direct fine-tuning yields inadequate results for certain classes due to overfitting. We efficiently resolve this issue via a patch-wise random mask operation.
Figure 4: Visualization results of different input images for different targets. For each text prompt of the target class, the left column shows the perturbation generated by our CGNC while the right column displays the corresponding adversarial examples.
Figure 5: Fooling rates (from Res-152 to VGG-16) in attacking 8 target classes on cross-domain scenarios. We also provide the results on ImageNet as a comparison.
...and 5 more figures
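
The patch-wise random mask operation mentioned in the Figure 3 caption can be illustrated with a small plain-Python sketch. This is an assumption-laden example rather than the authors' code: the function name `patchwise_random_mask`, the parameters `patch_size` and `drop_prob`, the choice of zero as the fill value, and the question of which tensor the mask is applied to during fine-tuning are all hypothetical. The sketch splits a 2D array into non-overlapping square patches and zeroes each patch independently with a fixed probability.

```python
import random


def patchwise_random_mask(image, patch_size, drop_prob, rng=None):
    """Zero out each patch_size x patch_size patch with probability drop_prob.

    image: 2D list of floats (a single-channel stand-in for an image or perturbation)
    rng: optional random.Random instance for reproducibility
    Returns a masked copy; the input is left unchanged.
    """
    if rng is None:
        rng = random.Random()
    height, width = len(image), len(image[0])
    masked = [row[:] for row in image]
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # One random draw per patch decides whether the whole patch is dropped.
            if rng.random() < drop_prob:
                for r in range(top, min(top + patch_size, height)):
                    for c in range(left, min(left + patch_size, width)):
                        masked[r][c] = 0.0
    return masked


# Hypothetical usage: an 8x8 array of ones split into four 4x4 patches.
image = [[1.0] * 8 for _ in range(8)]
masked = patchwise_random_mask(image, patch_size=4, drop_prob=0.5, rng=random.Random(0))
```

Because the decision is made once per patch, every patch in the output is either fully preserved or fully zeroed, which is what distinguishes this from pixel-wise dropout.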
