Boosting the Transferability of Adversarial Attacks with Global Momentum Initialization

Jiafeng Wang, Zhaoyu Chen, Kaixun Jiang, Dingkang Yang, Lingyi Hong, Pinxue Guo, Haijing Guo, Wenqiang Zhang

TL;DR

This work addresses the limited transferability of adversarial examples under defenses by analyzing gradient consistency and introducing Global Momentum Initialization (GI). GI uses gradient pre-convergence and a global search to mitigate gradient elimination and improve momentum convergence, yielding stronger transfer attacks that integrate with existing gradient-based methods and input transformations. Empirically, GI improves transferability in both the image and video domains, reaching an average attack success rate of up to 95.4% against advanced defenses and approaching white-box performance with ensembles, while remaining more time-efficient than prior high-cost approaches. Overall, GI provides a practical, versatile baseline for black-box transfer attacks and a new view of optimizing adversarial perturbations through gradient consistency.

Abstract

Deep Neural Networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding human-imperceptible perturbations to the benign inputs. Simultaneously, adversarial examples exhibit transferability across models, enabling practical black-box attacks. However, existing methods are still incapable of achieving the desired transfer attack performance. In this work, focusing on gradient optimization and consistency, we analyse the gradient elimination phenomenon as well as the local momentum optimum dilemma. To tackle these challenges, we introduce Global Momentum Initialization (GI), providing global momentum knowledge to mitigate gradient elimination. Specifically, we perform gradient pre-convergence before the attack and a global search during this stage. GI seamlessly integrates with existing transfer methods, significantly improving the success rate of transfer attacks by an average of 6.4% under various advanced defense mechanisms compared to the state-of-the-art method. Ultimately, GI demonstrates strong transferability in both image and video attack domains. Particularly, when attacking advanced defense methods in the image domain, it achieves an average attack success rate of 95.4%. The code is available at https://github.com/Omenzychen/Global-Momentum-Initialization.
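
The two-stage idea described in the abstract (gradient pre-convergence with a global search, followed by the actual attack) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the authors' code: the function and parameter names (`gi_attack`, `pre_steps`, `search_factor`), their default values, the MI-FGSM-style L1-normalized momentum update, the choice to restart the formal attack from the clean input while keeping only the accumulated momentum, and the toy quadratic loss that stands in for a surrogate model.

```python
def sign(v):
    # Sign of a scalar as -1, 0, or 1.
    return (v > 0) - (v < 0)

def l1_normalize(g):
    # Scale a gradient by its L1 norm (assumed MI-FGSM-style normalization).
    norm = sum(abs(v) for v in g) or 1.0
    return [v / norm for v in g]

def clip_to_ball(x_adv, x, eps):
    # Project x_adv back into the L-infinity ball of radius eps around x.
    return [min(max(a, o - eps), o + eps) for a, o in zip(x_adv, x)]

def momentum_steps(x, g, grad_fn, eps, alpha, steps, mu):
    # Run momentum sign-gradient ascent starting from the clean input x.
    # Returns the perturbed input and the accumulated momentum.
    x_adv = list(x)
    for _ in range(steps):
        grad = l1_normalize(grad_fn(x_adv))
        g = [mu * m + d for m, d in zip(g, grad)]
        x_adv = [a + alpha * sign(m) for a, m in zip(x_adv, g)]
        x_adv = clip_to_ball(x_adv, x, eps)
    return x_adv, g

def gi_attack(x, grad_fn, eps, steps=10, pre_steps=5, search_factor=10.0, mu=1.0):
    # Hypothetical sketch of global momentum initialization.
    alpha = eps / steps
    zero = [0.0] * len(x)
    # Stage 1 (assumed): pre-convergence with an enlarged step size;
    # the perturbed input is discarded and only the momentum is kept.
    _, g = momentum_steps(x, zero, grad_fn, eps, search_factor * alpha, pre_steps, mu)
    # Stage 2 (assumed): formal attack from the clean input, seeded with that momentum.
    x_adv, _ = momentum_steps(x, g, grad_fn, eps, alpha, steps, mu)
    return x_adv

# Toy stand-in for a surrogate model: the loss grows as x moves away from `center`.
center = [0.25, 0.45, 0.5]

def toy_loss(x):
    return sum((a - c) ** 2 for a, c in zip(x, center))

def toy_grad(x):
    return [2.0 * (a - c) for a, c in zip(x, center)]

x_clean = [0.2, 0.5, 0.8]
eps = 0.1
x_adv = gi_attack(x_clean, toy_grad, eps)
print(toy_loss(x_clean), toy_loss(x_adv))
```

In a real setting `grad_fn` would return the gradient of a surrogate network's classification loss with respect to the input, and the result would also be clipped to the valid pixel range, which this sketch omits.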

Paper Structure

This paper contains 17 sections, 12 equations, 5 figures, 8 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Pipeline of the generation and transfer process of adversarial examples. The module on the left illustrates example generation on the surrogate model over $T$ attack iterations, while the module on the right shows the transfer of the generated adversarial examples to the black-box model.
  • Figure 2: (a) shows the effect of the transfer attack at different step sizes; scale_2 and scale_10 denote the results when the step size is enlarged by factors of two and ten, respectively. (b) shows the adversarial examples under different attack methods. All the adversarial examples are generated with Inc-v3.
  • Figure 3: Visualisation of the attack process. CT [DBLP:journals/corr/abs-1908-06281] represents the combination of three transformation methods: DIM [xie2019improving], TIM [dong2019evading], and SIM [DBLP:journals/corr/abs-1908-06281]. $x$ represents the original data point and $x^{CT}$ represents the adversarial example obtained under the original CT-FGSM. $x^{pre}$ and $x^{GI-CT}$ denote the overall process of the pre-attack and the formal attack, respectively.
  • Figure 4: Analysis of gradient consistency (cosine similarity) between attack iterations. (a) shows the gradient consistency before momentum accumulation; (b) shows the gradient consistency between iterations after momentum accumulation; (c) compares the gradient similarity between the $1^{st}$ iteration and the other iterations with that between the $10^{th}$ iteration and the other iterations. For instance, at a convergence distance of 2, the gradient consistency between the $1^{st}$ and $3^{rd}$ rounds is nearly 0.3 lower than that between the $10^{th}$ and $8^{th}$ rounds, whereas with our approach the difference is less than 0.1. The smaller gap indicates that the gradients of the $1^{st}$ round are about as consistent with those of the other rounds as the gradients of the $10^{th}$ round are, which implies more adequate convergence.
  • Figure 5: Ablation experiments on the pre-convergence and global search factors. (a) and (b) show the attack success rates on six black-box models for different numbers of pre-convergence iterations. (c) shows the average attack success rate over the six models for different global search factors.
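
The gradient consistency measure named in the Figure 4 caption is the cosine similarity between gradients from different attack iterations. A minimal sketch of how such a pairwise comparison could be computed is given below; the helper names (`cosine_similarity`, `consistency_matrix`) and the example gradients are hypothetical and are not taken from the paper.

```python
import math

def cosine_similarity(u, v):
    # Cosine of the angle between two flattened gradient vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero gradients
    return dot / (norm_u * norm_v)

def consistency_matrix(grads):
    # Entry [i][j] is the similarity between the gradients of iterations i and j.
    return [[cosine_similarity(gi, gj) for gj in grads] for gi in grads]

# Made-up gradients from three attack iterations, flattened to vectors.
grads = [
    [1.0, 0.0, 0.5],
    [0.8, 0.1, 0.6],
    [-0.2, 0.9, 0.1],
]
matrix = consistency_matrix(grads)
for row in matrix:
    print([round(value, 3) for value in row])
```

Reading row `i` of the matrix against the iteration distance `|i - j|` gives the kind of per-distance comparison the caption describes for the first and tenth rounds.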