Boosting Adversarial Transferability via Ensemble Non-Attention

Yipeng Zou, Qin Liu, Jie Wu, Yu Peng, Guo Chen, Hui Zhou, Guanghui Ye

TL;DR

This work presents NAMEA, a novel ensemble attack that for the first time integrates the gradients from the non-attention areas of ensemble models into the iterative gradient optimization process. It also pioneers a way of decoupling the gradients of non-attention areas from those of attention areas while merging the gradients by meta-learning.

Abstract

Ensemble attacks integrate the outputs of surrogate models with diverse architectures and can be combined with various gradient-based attacks to improve adversarial transferability. However, previous work shows unsatisfactory attack performance when transferring across heterogeneous model architectures. The main reason is that the gradient update directions of heterogeneous surrogate models differ widely, making it hard to reduce the gradient variance of the ensemble while making the best of each individual model. To tackle this challenge, we design a novel ensemble attack, NAMEA, which for the first time integrates the gradients from the non-attention areas of ensemble models into the iterative gradient optimization process. Our design is inspired by the observation that the attention areas of heterogeneous models vary sharply, so the non-attention areas of ViTs are likely to be the focus of CNNs and vice versa. Therefore, we merge the gradients from the attention and non-attention areas of the ensemble models so as to fuse the transfer information of CNNs and ViTs. Specifically, we pioneer a new way of decoupling the gradients of non-attention areas from those of attention areas, while merging gradients by meta-learning. Empirical evaluations on the ImageNet dataset indicate that NAMEA outperforms AdaEA and SMER, the state-of-the-art ensemble attacks, by an average of 15.0% and 9.6%, respectively. This work is the first attempt to explore the power of ensemble non-attention in boosting cross-architecture transferability, providing new insights into launching ensemble attacks.
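
To make the idea of merging attention and non-attention gradients concrete, the following is a minimal toy sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the quadratic toy loss, the `(w, heatmap)` stand-ins for surrogate models, the names `split_masks`, `masked_grad`, and `merged_step`, the two-step order, and the plain-sum merge rule. It only shows how a gradient restricted to attention areas and a gradient restricted to non-attention areas might be combined into one sign-based update.

```python
import numpy as np

def split_masks(heatmap, threshold=0.5):
    """Binary attention mask and its complement (the non-attention mask)."""
    attn = (heatmap >= threshold).astype(np.float64)
    return attn, 1.0 - attn

def masked_grad(w, x, mask):
    """Gradient w.r.t. x of the toy loss 0.5 * sum(w * (mask * x)**2)."""
    return w * mask * mask * x

def merged_step(x, models, alpha=0.01):
    """One toy update: attention gradients first, then non-attention gradients.

    `models` is a list of (w, heatmap) pairs. Both are hypothetical stand-ins
    for a real surrogate network and its attention map.
    """
    masks = [split_masks(h) for _, h in models]
    # Step 1: average gradient restricted to each model's attention area.
    g_tr = np.mean(
        [masked_grad(w, x, m[0]) for (w, _), m in zip(models, masks)], axis=0
    )
    x_tmp = np.clip(x + alpha * np.sign(g_tr), 0.0, 1.0)
    # Step 2: average gradient restricted to each model's non-attention area,
    # evaluated at the temporarily updated input.
    g_te = np.mean(
        [masked_grad(w, x_tmp, m[1]) for (w, _), m in zip(models, masks)], axis=0
    )
    g = g_tr + g_te  # assumed merge rule: plain sum of the two gradients
    return np.clip(x + alpha * np.sign(g), 0.0, 1.0), g
```

In this toy setting, two models whose attention areas cover opposite halves of the image each contribute gradient signal to the half the other one ignores, which is the intuition the abstract gives for fusing CNN and ViT information.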

Paper Structure

This paper contains 22 sections, 12 equations, 11 figures, 13 tables, 1 algorithm.

Figures (11)

  • Figure 1: Attention heatmaps and classification accuracies of clean and masked images. A masked image is crafted by replacing the attention area of ResNet-18 with random noise. Target models include ResNet-50, DeiT-S, and ViT-S.
  • Figure 2: The attack direction search strategies of AdaEA, SMER and NAMEA. AdaEA focuses on reducing gradient discrepancies to improve attack effectiveness. SMER leverages model diversity to search the attack direction. NAMEA merges gradients of attention and non-attention areas by meta-learning to obtain a more accurate attack direction.
  • Figure 3: Overview of NAMEA. Left: Meta-gradient optimization process. Attention meta-training updates the gradient $g^{k+1}_{tr}$ based on the model's attention areas; non-attention meta-testing updates the gradient $g^{k+1}_{te}$ based on the model's non-attention areas; the final update merges the gradients from the meta-training and meta-testing steps to obtain the final gradient $g^{t+1}$. Right: Comparison of the perturbation search processes. NAMEA can quickly find the optimal direction and avoid local optima.
  • Figure 4: Average ASRs (%) of NAMEA under varying thresholds. Base: I-FGSM (Left) and MI-FGSM (Right).
  • Figure 5: Left: Ablation study on meta-learning and GSO. Right: Ablation study on padding values in masked areas.
  • ...and 6 more figures
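
The masked images described in the Figure 1 caption (attention area replaced with random noise) can be sketched as follows. This is an illustrative assumption rather than the paper's code: the function name `mask_attention_area`, the `(H, W, C)` image layout with values in [0, 1], the uniform noise distribution, and the default threshold of 0.5 are all hypothetical choices.

```python
import numpy as np

def mask_attention_area(image, heatmap, threshold=0.5, seed=0):
    """Replace pixels whose attention score reaches `threshold` with noise.

    image:   float array of shape (H, W, C), values assumed to lie in [0, 1]
    heatmap: float array of shape (H, W), assumed normalized to [0, 1]
    """
    rng = np.random.default_rng(seed)
    attn = heatmap >= threshold        # boolean attention mask, shape (H, W)
    noise = rng.random(image.shape)    # uniform noise in [0, 1), an assumption
    # Broadcast the mask over the channel axis: noise inside, original outside.
    return np.where(attn[..., None], noise, image)
```

Raising or lowering `threshold` shrinks or grows the masked region, which is the kind of knob the Figure 4 caption refers to when it reports results under varying thresholds.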