WPDA: Frequency-based Backdoor Attack with Wavelet Packet Decomposition
Zhengyao Song, Yongqiang Li, Danni Yuan, Li Liu, Shaokui Wei, Baoyuan Wu
TL;DR
This work presents WPDA, a dataset-specific backdoor attack that uses Wavelet Packet Decomposition (WPD) to identify the critical frequency regions that shape DNNs' image classification behavior, and embeds a frequency-domain trigger into those regions only. The method reports high attack success rates at very low poisoning ratios across CIFAR-10, CIFAR-100, and Tiny ImageNet (e.g., 98.12% ASR on CIFAR-10 at $p=0.004\%$), while remaining stealthy and evading several state-of-the-art defenses. The authors provide both quantitative results and qualitative analyses, including t-SNE and $\ell_2$-distance KDE visualizations, to show that the model learns the WPDA trigger rather than memorizing the poisoned samples. Collectively, WPDA advances understanding of frequency-domain backdoors and highlights the need for defenses that specifically target dataset- and frequency-aware poisoning strategies.
Abstract
This work explores an emerging security threat against deep neural network (DNN) based image classification: the backdoor attack. In this scenario, the attacker aims to inject a backdoor into the model by manipulating training data, such that the backdoor can be activated by a particular trigger and drives the model to make a target prediction at inference. Most existing data-poisoning-based attacks struggle to succeed at low poisoning ratios, which increases their risk of being detected by defense methods. In this paper, we propose a novel frequency-based backdoor attack via Wavelet Packet Decomposition (WPD). WPD decomposes the original image signal into a spectrogram that contains frequency information with different semantic meanings. We leverage WPD to statistically analyze the frequency distribution of the dataset and infer the key frequency regions that DNNs focus on, and we inject the trigger information only into those regions. Our method consists of three main parts: 1) selecting the poisoning frequency regions in the spectrogram; 2) generating the trigger; and 3) generating the poisoned dataset. Our method is stealthy and precise: it achieves a 98.12% Attack Success Rate (ASR) on CIFAR-10 at an extremely low poisoning ratio of 0.004% (i.e., only 2 poisoned samples among 50,000 training samples) and can bypass most existing defense methods. We also provide visualization analyses to explain why our method works.
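
To make the idea of restricting a trigger to selected frequency regions concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single-channel image stored as a nested list with side lengths divisible by $2^{\text{level}}$, uses the Haar wavelet for simplicity, and blends the trigger additively with a hypothetical `strength` factor. The function names (`haar_split`, `haar_merge`, `wpd`, `iwpd`, `poison`), the sub-band labels `a`/`h`/`v`/`d`, and the `key_paths` argument are illustrative assumptions; how the key regions are actually chosen from dataset statistics is not reproduced here.

```python
# Sketch only: 2D Haar wavelet packet decomposition (WPD) and a perturbation
# restricted to chosen sub-bands. All names and the additive blending rule
# are assumptions made for illustration.

def haar_split(x):
    """One orthonormal 2D Haar analysis step: four half-size sub-bands."""
    h, w = len(x) // 2, len(x[0]) // 2
    bands = {k: [[0.0] * w for _ in range(h)] for k in "ahvd"}
    for i in range(h):
        for j in range(w):
            p, q = x[2 * i][2 * j], x[2 * i][2 * j + 1]
            r, s = x[2 * i + 1][2 * j], x[2 * i + 1][2 * j + 1]
            bands["a"][i][j] = (p + q + r + s) / 2  # local average
            bands["h"][i][j] = (p + q - r - s) / 2  # top rows minus bottom rows
            bands["v"][i][j] = (p - q + r - s) / 2  # left cols minus right cols
            bands["d"][i][j] = (p - q - r + s) / 2  # diagonal difference
    return bands


def haar_merge(b):
    """Inverse of haar_split: rebuild the full-size block from four sub-bands."""
    h, w = len(b["a"]), len(b["a"][0])
    x = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            a, hh, v, d = b["a"][i][j], b["h"][i][j], b["v"][i][j], b["d"][i][j]
            x[2 * i][2 * j] = (a + hh + v + d) / 2
            x[2 * i][2 * j + 1] = (a + hh - v - d) / 2
            x[2 * i + 1][2 * j] = (a - hh + v - d) / 2
            x[2 * i + 1][2 * j + 1] = (a - hh - v + d) / 2
    return x


def wpd(x, level):
    """Full packet tree: unlike a plain DWT, every sub-band is split again."""
    nodes = {"": x}
    for _ in range(level):
        nodes = {path + k: sub
                 for path, band in nodes.items()
                 for k, sub in haar_split(band).items()}
    return nodes  # 4**level leaves, keyed by paths such as "ad" or "hv"


def iwpd(nodes):
    """Inverse WPD: merge sibling leaves level by level back to the image."""
    nodes = dict(nodes)
    while "" not in nodes:
        parents = {path[:-1] for path in nodes}
        nodes = {p: haar_merge({k: nodes[p + k] for k in "ahvd"})
                 for p in parents}
    return nodes[""]


def poison(image, trigger, key_paths, strength, level=2):
    """Add the trigger's coefficients to the image in the chosen sub-bands only."""
    img_nodes, trg_nodes = wpd(image, level), wpd(trigger, level)
    for path in key_paths:
        img_nodes[path] = [
            [c + strength * t for c, t in zip(row_c, row_t)]
            for row_c, row_t in zip(img_nodes[path], trg_nodes[path])
        ]
    return iwpd(img_nodes)
```

For example, `poison(image, trigger, ["dd", "hv"], 0.5)` would leave every level-2 sub-band of `image` unchanged except `dd` and `hv`. Because this Haar step is orthonormal and perfectly invertible, decomposing the result again returns exactly the modified coefficients, which makes the frequency-localized edit easy to verify.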
