Boosting Adversarial Transferability via Commonality-Oriented Gradient Optimization

Yanting Gao, Yepeng Liu, Junming Liu, Qi Zhang, Hongyun Zhang, Duoqian Miao, Cairong Zhao

TL;DR

This work tackles the problem of adversarial overfitting in transfer-based attacks on Vision Transformers by proposing a twofold strategy: Commonality Enhancement (CE) and Individuality Suppression (IS). CE perturbs mid-to-low frequency components to emphasize features shared across ViTs trained on the same data, while IS adaptively suppresses surrogate-specific gradient directions, particularly in the qkv module. The resulting Commonality-Oriented Gradient Optimization (COGO) produces perturbations that align with shared model decision patterns and avoid surrogate biases, yielding substantial transferability gains over state-of-the-art methods across ViTs and CNNs. The approach is validated through extensive experiments, ablations, and gradient-dispersion analyses, which makes it a practical tool for evaluating, and potentially improving, ViT robustness in black-box settings.

Abstract

Exploring effective and transferable adversarial examples is vital for understanding the characteristics and mechanisms of Vision Transformers (ViTs). However, adversarial examples generated from surrogate models often exhibit weak transferability in black-box settings due to overfitting. Existing methods improve transferability by diversifying perturbation inputs or applying uniform gradient regularization within surrogate models, yet they have not fully leveraged the shared and unique features of surrogate models trained on the same task, leading to suboptimal transfer performance. Therefore, enhancing perturbations of common information shared by surrogate models and suppressing those tied to individual characteristics offers an effective way to improve transferability. Accordingly, we propose a commonality-oriented gradient optimization strategy (COGO) consisting of two components: Commonality Enhancement (CE) and Individuality Suppression (IS). CE perturbs the mid-to-low frequency regions, leveraging the fact that ViTs trained on the same dataset tend to rely more on mid-to-low frequency information for classification. IS employs adaptive thresholds to evaluate the correlation between backpropagated gradients and model individuality, assigning weights to gradients accordingly. Extensive experiments demonstrate that COGO significantly improves the transfer success rates of adversarial attacks, outperforming current state-of-the-art methods.
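
To make the idea behind CE concrete, the following is a minimal sketch of restricting a perturbation to the lower part of the frequency spectrum. It is not the authors' implementation: the abstract does not state which transform, band limits, or mask shape COGO uses, so the 2-D FFT, the radial mask, and the names `radial_lowpass_mask`, `restrict_to_low_mid_band`, and `cutoff` are all assumptions made for illustration.

```python
import numpy as np


def radial_lowpass_mask(height, width, cutoff):
    """Binary mask for a centered (fftshift-ed) spectrum.

    Keeps frequencies whose normalized radius is <= cutoff.
    `cutoff` is a hypothetical parameter: 1.0 corresponds to the
    Nyquist frequency along a single axis.
    """
    fy = np.fft.fftshift(np.fft.fftfreq(height))  # cycles/pixel in [-0.5, 0.5)
    fx = np.fft.fftshift(np.fft.fftfreq(width))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2) / 0.5
    return (radius <= cutoff).astype(np.float64)


def restrict_to_low_mid_band(perturbation, cutoff=0.5):
    """Project a 2-D perturbation onto the band selected by the mask.

    Assumption: a hard radial cut-off stands in for whatever
    band selection the paper actually applies.
    """
    height, width = perturbation.shape
    spectrum = np.fft.fftshift(np.fft.fft2(perturbation))
    filtered = spectrum * radial_lowpass_mask(height, width, cutoff)
    # The mask is symmetric in frequency, so the inverse transform is
    # real up to floating-point error; drop the residual imaginary part.
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))


# Usage: a constant offset (pure DC) survives the filter, while a
# pixel-level checkerboard (highest spatial frequency) is removed.
rows, cols = np.indices((8, 8))
checkerboard = (-1.0) ** (rows + cols)
low_band_part = restrict_to_low_mid_band(checkerboard, cutoff=0.5)
```

In an iterative attack, a projection of this kind would be applied to the perturbation or its gradient at each step, so that the update concentrates on the band that the abstract says ViTs trained on the same dataset tend to rely on.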

Paper Structure

This paper contains 38 sections, 12 equations, 7 figures, 9 tables, and 1 algorithm.

Figures (7)

  • Figure 1: A comparison of adversarial examples and gradient sensitivity maps generated for the same input image across different models (ViT-base and Visformer). Specifically, (a) and (c) show the adversarial examples generated by ViT-base and Visformer, while (b) and (d) show their corresponding gradient sensitivity maps.
  • Figure 2: An iteration of adversarial example generation under the COGO strategy, which enhances mid-to-low frequency components via Commonality Enhancement (CE) and suppresses surrogate-specific gradients via Individuality Suppression (IS) to improve transferability.
  • Figure 3: (a) shows the original input, while (b) illustrates the energy intensity distribution in the frequency domain. It can be observed that the low-frequency components, typically located near the center of the spectrum, exhibit stronger energy compared to the high-frequency components.
  • Figure 4: Changes in the gradient dispersion indicator across DeiT blocks before and after applying COGO. Larger values reflect more uniform gradient distributions.
  • Figure 5: Adversarial attack examples at different N
  • ...and 2 more figures
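
The observation in the Figure 3 caption, that energy concentrates near the center of a shifted spectrum, can be checked numerically. The sketch below is illustrative only: the Gaussian blob is a synthetic stand-in for a natural image, and the function names and the `half_width` parameter are hypothetical, not taken from the paper.

```python
import numpy as np


def centered_power_spectrum(image):
    """Power spectrum with the zero-frequency bin moved to the center."""
    return np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2


def central_energy_fraction(image, half_width):
    """Share of total spectral energy inside a square window around DC.

    `half_width` (in frequency bins) is a hypothetical parameter that
    sets how much of the spectrum counts as "low frequency" here.
    """
    power = centered_power_spectrum(image)
    cy, cx = power.shape[0] // 2, power.shape[1] // 2  # DC bin after fftshift
    window = power[cy - half_width:cy + half_width + 1,
                   cx - half_width:cx + half_width + 1]
    return window.sum() / power.sum()


# Synthetic stand-in for a natural image: a smooth Gaussian blob.
rows, cols = np.mgrid[0:64, 0:64]
smooth_image = np.exp(-((rows - 32) ** 2 + (cols - 32) ** 2) / (2 * 8.0 ** 2))

# Counter-example: a checkerboard has all its energy at the highest frequency.
checkerboard = (-1.0) ** (rows + cols)

smooth_fraction = central_energy_fraction(smooth_image, half_width=4)
checker_fraction = central_energy_fraction(checkerboard, half_width=4)
```

For the smooth image nearly all energy falls inside the small central window, while the checkerboard contributes essentially none, which matches the low-versus-high frequency contrast the caption describes.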