Boosting Adversarial Transferability by Block Shuffle and Rotation

Kunyu Wang, Xuanran He, Wenxuan Wang, Xiaosen Wang

TL;DR

This work tackles the transferability gap between black-box and white-box adversarial attacks. By examining attention heatmaps, it shows that adversarial examples from existing input transformation based attacks produce different heatmaps on different models, which may limit transferability. It introduces Block Shuffle and Rotation (BSR), a transformation that splits an image into blocks, randomly shuffles and rotates them, and averages the gradient over multiple transformed copies, disrupting the intrinsic relations within the image so that heatmaps agree better across models. Empirical results on ImageNet show that BSR substantially improves transferability over state-of-the-art input transformation based attacks in both single-model and ensemble-model settings and remains effective against several defenses; it can also be combined with other input transformations for further gains. The approach exposes a practical security risk in black-box scenarios and offers a general framework for enhancing transferability by stabilizing cross-model attention patterns. Code is publicly available.

Abstract

Adversarial examples mislead deep neural networks with imperceptible perturbations and have brought significant threats to deep learning. An important aspect is their transferability, which refers to their ability to deceive other models, thus enabling attacks in the black-box setting. Though various methods have been proposed to boost transferability, the performance still falls short compared with white-box attacks. In this work, we observe that existing input transformation based attacks, one of the mainstream transfer-based attacks, result in different attention heatmaps on various models, which might limit the transferability. We also find that breaking the intrinsic relation of the image can disrupt the attention heatmap of the original image. Based on this finding, we propose a novel input transformation based attack called block shuffle and rotation (BSR). Specifically, BSR splits the input image into several blocks, then randomly shuffles and rotates these blocks to construct a set of new images for gradient calculation. Empirical evaluations on the ImageNet dataset demonstrate that BSR could achieve significantly better transferability than the existing input transformation based methods under single-model and ensemble-model settings. Combining BSR with the current input transformation method can further improve the transferability, which significantly outperforms the state-of-the-art methods. Code is available at https://github.com/Trustworthy-AI-Group/BSR
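The transformation described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation (see the repository linked above for that). It assumes a single-channel image stored as a list of rows, an equal-sized grid of blocks, nearest-neighbour rotation with zero fill, and illustrative parameter values; the names `rotate_block`, `bsr_transform`, and `bsr_copies` are hypothetical. Gradient computation on the surrogate model is left out.

```python
import math
import random


def rotate_block(block, angle_deg):
    """Rotate a 2D block about its centre (nearest neighbour, zero fill).

    Assumption: pixels rotated in from outside the block are set to 0.0.
    """
    h, w = len(block), len(block[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel that lands on (y, x).
            dy, dx = y - cy, x - cx
            sx = round(cx + cos_a * dx + sin_a * dy)
            sy = round(cy - sin_a * dx + cos_a * dy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = block[sy][sx]
    return out


def bsr_transform(image, n_blocks=2, max_angle=20.0, rng=random):
    """Split into n_blocks x n_blocks blocks, shuffle them, rotate each one.

    Assumptions: equal-sized blocks and an illustrative max_angle (degrees).
    """
    h, w = len(image), len(image[0])
    if h % n_blocks or w % n_blocks:
        raise ValueError("image size must be divisible by n_blocks")
    bh, bw = h // n_blocks, w // n_blocks
    # Cut the image into blocks in row-major order.
    blocks = [
        [row[c * bw:(c + 1) * bw] for row in image[r * bh:(r + 1) * bh]]
        for r in range(n_blocks)
        for c in range(n_blocks)
    ]
    rng.shuffle(blocks)
    blocks = [rotate_block(b, rng.uniform(-max_angle, max_angle)) for b in blocks]
    # Stitch the shuffled, rotated blocks back into one image.
    out = []
    for r in range(n_blocks):
        row_blocks = blocks[r * n_blocks:(r + 1) * n_blocks]
        for y in range(bh):
            out.append([v for b in row_blocks for v in b[y]])
    return out


def bsr_copies(image, n_copies=5, n_blocks=2, max_angle=20.0, rng=random):
    """Build the set of transformed images used for gradient calculation."""
    return [bsr_transform(image, n_blocks, max_angle, rng) for _ in range(n_copies)]
```

In an actual attack, each copy would be fed to the surrogate model and the resulting gradients averaged before the perturbation update; an autodiff framework would handle backpropagation through the transformation to the original input.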

Paper Structure (21 sections, 5 equations, 22 figures, 7 tables, 1 algorithm)

Figures (22)

  • Figure 3: Attack success rates (%) of various models on the adversarial examples generated by MI-FGSM, BS, BR and BSR, respectively.
  • Figure 4: Attack success rates (%) of various models on the adversarial examples generated by BSR with various numbers of blocks, numbers of transformed images, and ranges of rotation angles. The adversarial examples are crafted on the Inc-v3 model and tested on the other six models under the black-box setting.
  • Figures (unnumbered, three entries): Raw Image
  • ...and 17 more figures