WPDA: Frequency-based Backdoor Attack with Wavelet Packet Decomposition

Zhengyao Song, Yongqiang Li, Danni Yuan, Li Liu, Shaokui Wei, Baoyuan Wu

TL;DR

This work presents WPDA, a dataset-specific backdoor attack that uses Wavelet Packet Decomposition (WPD) to identify the frequency regions that most influence DNN image classification, and embeds a frequency-domain trigger in those regions. The method achieves high attack success at very low poisoning ratios (e.g., $p=0.004\%$) across CIFAR-10, CIFAR-100, and Tiny ImageNet, while remaining stealthy and evading several state-of-the-art defenses. The authors provide quantitative results and qualitative analyses, including t-SNE and $\ell_2$-distance KDE visualizations, to show that the model learns the WPDA trigger rather than memorizing the poisoned samples. Collectively, WPDA advances the understanding of frequency-domain backdoors and highlights the need for defenses that specifically target dataset- and frequency-aware poisoning strategies.

Abstract

This work explores an emerging security threat against deep neural network (DNN) based image classification, i.e., the backdoor attack. In this scenario, the attacker aims to inject a backdoor into the model by manipulating training data, such that the backdoor can be activated by a particular trigger and causes the model to make a target prediction at inference. Currently, most existing data poisoning-based attacks struggle to succeed at low poisoning ratios, and the higher ratios they require increase the risk of being caught by defense methods. In this paper, we propose a novel frequency-based backdoor attack via Wavelet Packet Decomposition (WPD). WPD decomposes the original image signal into a spectrogram that contains frequency information with different semantic meanings. We leverage WPD to statistically analyze the frequency distribution of the dataset to infer the key frequency regions that DNNs focus on, and the trigger information is injected only into those key frequency regions. Our method consists of three main parts: 1) selection of the poisoning frequency regions in the spectrogram; 2) trigger generation; 3) generation of the poisoned dataset. Our method is stealthy and precise, as evidenced by a 98.12% Attack Success Rate (ASR) on CIFAR-10 at the extremely low poisoning ratio of 0.004% (i.e., only 2 poisoned samples among 50,000 training samples), and it can bypass most existing defense methods. We also provide visualization analyses to explain why our method works.
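As a quick check on the arithmetic in the abstract, the short sketch below converts between a poisoning ratio expressed in percent and a count of poisoned samples. The function names `poisoned_count` and `poisoning_ratio_percent` are hypothetical helpers introduced here for illustration; they are not taken from the paper.

```python
import math


def poisoned_count(num_train: int, ratio_percent: float) -> int:
    """Number of poisoned samples for a ratio given in percent (hypothetical helper)."""
    return round(num_train * ratio_percent / 100)


def poisoning_ratio_percent(num_poisoned: int, num_train: int) -> float:
    """Poisoning ratio in percent for a given number of poisoned samples."""
    return 100 * num_poisoned / num_train


# CIFAR-10 has 50,000 training samples; the abstract reports p = 0.004%.
count = poisoned_count(50_000, 0.004)
ratio = poisoning_ratio_percent(2, 50_000)
print(count, ratio)
```

With 50,000 training samples, a ratio of 0.004% corresponds to the 2 poisoned samples stated in the abstract.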

Paper Structure

This paper contains 40 sections, 3 equations, 12 figures, 6 tables, 1 algorithm.

Figures (12)

  • Figure 1: Principle of wavelet packet decomposition. Region 'a' contains low frequency, region 'h' contains high-horizontal frequency, region 'v' contains high-vertical frequency, and region 'd' contains high-diagonal frequency. WPD requires padding $L$ pixels on each edge of the image.
  • Figure 2: Impact of high and low frequency information on image vision.
  • Figure 3: Frequency information distribution of CIFAR-10 in 'a', 'h', 'v', 'd' parent-spectrogram, respectively.
  • Figure 4: Process of generating poisoned samples with WPDA. Before inserting the trigger, we compute frequency distribution statistics over all training samples based on WPD to identify the most critical sub-spectrogram in each parent-spectrogram as the poisoning regions (disregarding the lowest frequency region), and then generate the mask used for poisoned sample generation. In the generation process, $\mathcal{T}$ represents the average transformation.
  • Figure 7: The selected regions for CIFAR-10, CIFAR-100, and Tiny ImageNet, respectively.
  • ...and 7 more figures
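
The decomposition described in the Figure 1 caption can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the orthonormal Haar wavelet (this section does not say which wavelet the paper uses), it assumes image sides divisible by $2^{\text{level}}$ so that no padding is needed, and the assignment of the labels 'h' and 'v' to the two directional sub-bands is a labeling assumption. The function names `haar_step`, `wavelet_packet`, and `energy` are hypothetical.

```python
def haar_step(band):
    """One level of the 2D orthonormal Haar transform on a list-of-lists.

    Returns the four sub-bands keyed 'a', 'h', 'v', 'd'.
    Assumes an even number of rows and columns.
    """
    rows, cols = len(band), len(band[0])
    out = {k: [[0.0] * (cols // 2) for _ in range(rows // 2)] for k in "ahvd"}
    for i in range(0, rows, 2):
        for j in range(0, cols, 2):
            p, q = band[i][j], band[i][j + 1]
            r, s = band[i + 1][j], band[i + 1][j + 1]
            out["a"][i // 2][j // 2] = (p + q + r + s) / 2  # local average (low frequency)
            out["h"][i // 2][j // 2] = (p + q - r - s) / 2  # difference between the two rows
            out["v"][i // 2][j // 2] = (p - q + r - s) / 2  # difference between the two columns
            out["d"][i // 2][j // 2] = (p - q - r + s) / 2  # diagonal difference
    return out


def wavelet_packet(image, level):
    """Full wavelet packet tree: every sub-band is decomposed again at each level.

    Returns a dict mapping a path such as 'ah' (parent 'a', child 'h')
    to its coefficient matrix.
    """
    nodes = {"": image}
    for _ in range(level):
        nodes = {
            path + key: sub
            for path, band in nodes.items()
            for key, sub in haar_step(band).items()
        }
    return nodes


def energy(band):
    """Sum of squared coefficients, an illustrative per-region statistic."""
    return sum(x * x for row in band for x in row)


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
nodes = wavelet_packet(image, level=2)
for path in sorted(nodes):
    print(path, energy(nodes[path]))
```

Unlike a standard wavelet decomposition, which recurses only on the 'a' sub-band, the packet variant splits all four sub-bands at every level, so two levels yield 16 sub-spectrograms grouped under the four parent-spectrograms. The per-region energy printed here is only one possible statistic; the statistic the paper uses to pick poisoning regions is not given in this section.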