Towards a Novel Perspective on Adversarial Examples Driven by Frequency

Zhun Zhang, Yi Zeng, Qihe Liu, Shijie Zhou

TL;DR

This paper uses wavelet packet decomposition to examine how adversarial perturbations are distributed across frequency components, challenging the view that perturbations are purely high- or low-frequency. It finds that significant perturbations reside predominantly in the high-frequency parts of low-frequency bands and uses this insight to design a black-box attack that combines perturbations across selected bands, notably ${\tt aaa}$ and the high-frequency components ${\tt daa}$ and ${\tt dad}$. The authors introduce the Normalized Disturbance Visibility (NDV) index to better align imperceptibility with human perception, defined as $NDV = C \frac{\lVert x - x_{adv} \rVert_2}{\lVert x - x_{adv} \rVert_0 + \epsilon}$. Across CIFAR-10/100 and ImageNet-1K with multiple architectures, the proposed method reaches an attack success rate of nearly 99% and is more query-efficient than the baselines, highlighting the value of a frequency-centric view for adversarial robustness and defense design.
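The NDV formula above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `ndv`, the default `c=1.0`, and the default `eps=1e-8` are assumptions, since the summary does not give values for $C$ or $\epsilon$.

```python
import numpy as np

def ndv(x, x_adv, c=1.0, eps=1e-8):
    """NDV = C * ||x - x_adv||_2 / (||x - x_adv||_0 + eps).

    c and eps are placeholder defaults (assumed, not taken from the paper).
    """
    delta = np.asarray(x_adv, dtype=float) - np.asarray(x, dtype=float)
    l2 = np.linalg.norm(delta.ravel(), ord=2)  # Euclidean size of the perturbation
    l0 = np.count_nonzero(delta)               # number of modified entries
    return c * l2 / (l0 + eps)

x = np.zeros((4, 4))

dense = x + 0.25      # every entry shifted slightly: L2 = 1, L0 = 16
patch = x.copy()
patch[0, 0] = 1.0     # a single entry changed a lot: L2 = 1, L0 = 1

print(ndv(x, dense), ndv(x, patch))
```

Both toy perturbations have the same $L_2$ distance, yet the formula assigns the concentrated single-entry change a much larger value than the spread-out one, which is the distinction between continuous and discrete perturbations that the $L_2$ norm alone cannot express.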

Abstract

Enhancing our understanding of adversarial examples is crucial for the secure application of machine learning models in real-world scenarios. A prevalent method for analyzing adversarial examples is through a frequency-based approach. However, existing research indicates that attacks designed to exploit either low-frequency or high-frequency information can enhance attack performance, which leaves the relationship between adversarial perturbations and different frequency components unclear. In this paper, we seek to demystify this relationship by exploring the characteristics of adversarial perturbations within the frequency domain. We employ wavelet packet decomposition for detailed frequency analysis of adversarial examples and conduct statistical examinations across various frequency bands. Intriguingly, our findings indicate that significant adversarial perturbations are present within the high-frequency components of low-frequency bands. Drawing on this insight, we propose a black-box adversarial attack algorithm based on combining different frequency bands. Experiments conducted on multiple datasets and models demonstrate that combining low-frequency bands and high-frequency components of low-frequency bands can significantly enhance attack efficiency. The average attack success rate reaches 99%, surpassing attacks that utilize a single frequency segment. Additionally, we introduce the normalized disturbance visibility index as a solution to the limitations of the $L_2$ norm in assessing continuous and discrete perturbations.

Paper Structure (18 sections, 9 equations, 6 figures, 4 tables, 1 algorithm)

Figures (6)

  • Figure 1: After an initial decomposition, the original image is split into two components: ${a}$, representing the low-frequency band, and ${d}$, representing the high-frequency band at the $1^{st}$ layer. Subsequent decomposition of both ${a}$ and ${d}$ results in four distinct frequency bands at the $2^{nd}$ layer. This process continues, with the bands from the $2^{nd}$ layer being further decomposed to yield eight frequency bands at the $3^{rd}$ layer. Among these, the $\{aaa\}$ band preserves the core attributes of the original image, such as color, brightness, and general structure. Conversely, the other seven bands are primarily linked to edge definition.
  • Figure 2: This analysis compares the average cosine similarity between adversarial and pure examples across different frequency bands. The first decomposition yields the $\{a, d\}$ bands, and subsequent decompositions of $\{a\}$ and $\{d\}$ result in the $\{aa, da\}$ and $\{ad, dd\}$ bands, and so forth.
  • Figure 3: The proposed attack framework. The method has two parts: (a) decomposing the digital image into multiple frequency bands using wavelet packet decomposition, and (b) carrying out the adversarial attack using frequency-band selection and perturbation-addition strategies.
  • Figure 4: We compare some adversarial examples and their perturbations, all normalized to the same $L_2$ distance. These include (a) PGD, (b) our attack, and (c) Patch attack. For visualization purposes, the perturbations have been amplified by a factor of 10.
  • Figure 5: Performance of various attacks on ResNet-50 (left) and DenseNet-121 (right).
  • ...and 1 more figures
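
The three-layer decomposition in Figure 1 and the per-band comparison in Figure 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses a Haar filter on a 1-D signal (the summary does not name the wavelet or say how images are fed to the transform), random noise stands in for a real adversarial perturbation, and the helper names `haar_step`, `wavelet_packet_bands`, and `band_cosine_similarity` are hypothetical. Band names prepend the newest filter letter, so that splitting `a` gives `aa` and `da`, as in the Figure 2 caption.

```python
import numpy as np

def haar_step(signal):
    """One Haar analysis step: returns the (approximation, detail) halves."""
    even, odd = signal[0::2], signal[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def wavelet_packet_bands(signal, levels=3):
    """Full wavelet packet tree: every band is split again at each level.

    Naming assumption: the newest filter letter is prepended, so 'daa' is
    the detail (high-frequency) part of 'aa'.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    bands = {"": signal}
    for _ in range(levels):
        next_bands = {}
        for name, coeffs in bands.items():
            approx, detail = haar_step(coeffs)
            next_bands["a" + name] = approx
            next_bands["d" + name] = detail
        bands = next_bands
    return bands

def band_cosine_similarity(clean, adversarial, levels=3):
    """Cosine similarity between clean and adversarial coefficients per band."""
    clean_bands = wavelet_packet_bands(clean, levels)
    adv_bands = wavelet_packet_bands(adversarial, levels)
    sims = {}
    for name, u in clean_bands.items():
        v = adv_bands[name]
        denom = np.linalg.norm(u) * np.linalg.norm(v)
        sims[name] = float(u @ v / denom) if denom > 0 else float("nan")
    return sims

rng = np.random.default_rng(0)
clean = rng.random(64)                                # stand-in for image data
adversarial = clean + 0.05 * rng.standard_normal(64)  # random noise, not a real attack

similarities = band_cosine_similarity(clean, adversarial)
for name in sorted(similarities):
    print(name, round(similarities[name], 4))
```

A lower similarity in a band means the perturbation changed that band's coefficients more relative to the clean content. With random noise the pattern carries no meaning; the sketch only shows how such a per-band statistic could be computed.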