DFB: A Data-Free, Low-Budget, and High-Efficacy Clean-Label Backdoor Attack

Binhao Ma, Jiahui Wang, Dejun Wang, Bo Meng

TL;DR

This work addresses the practical limitation that many clean-label backdoor attacks require access to the victim's training data. It introduces DFB, a data-free clean-label backdoor attack that leverages Public Out-of-Distribution (POOD) samples and a decoder–encoder pipeline to generate triggers without target-class features, using two trigger-generation strategies. On CIFAR-10, Tiny-ImageNet, and TSRD, DFB achieves ultra-low poisoning rates (as low as 0.1% on CIFAR-10, 0.025% on Tiny-ImageNet, and 0.4% on TSRD) with high attack success rates and preserved clean accuracy, outperforming several baselines. Moreover, DFB remains effective against four standard defenses (Neural Cleanse, pruning, STRIP, SentiNet), underscoring the need for data-free defense mechanisms in real-world settings.
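To put these rates in perspective, the following minimal sketch converts a poisoning rate into an absolute number of poisoned images. It assumes the rate is measured against the full training set and uses the standard training-split sizes of CIFAR-10 (50,000 images) and Tiny-ImageNet (100,000 images); neither assumption is stated in this summary, and `poison_budget` is a hypothetical helper name. TSRD is left out because its split size is not given here.

```python
def poison_budget(train_size: int, rate_percent: float) -> int:
    """Number of poisoned samples implied by a poisoning rate given in percent."""
    return round(train_size * rate_percent / 100)


# Assumed standard training-split sizes (not stated in the source).
TRAIN_SIZES = {"CIFAR-10": 50_000, "Tiny-ImageNet": 100_000}

# Poisoning rates reported in the TL;DR, in percent.
RATES = {"CIFAR-10": 0.1, "Tiny-ImageNet": 0.025}

budgets = {name: poison_budget(TRAIN_SIZES[name], RATES[name]) for name in RATES}

for name, count in budgets.items():
    print(f"{name}: {count} poisoned images")
```

Under these assumptions, the reported rates correspond to 50 poisoned images for CIFAR-10 and 25 for Tiny-ImageNet.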

Abstract

In backdoor attacks, correctly labeling the injected data is essential for evading rudimentary detection mechanisms. This requirement has driven the development of clean-label attacks, which are harder to detect because they preserve the original labels of the injected data. Current clean-label attacks depend primarily on extensive knowledge of the training dataset. In practice, however, such comprehensive access is often unattainable, because training datasets are typically compiled from various independent sources. Departing from conventional clean-label attacks, our research introduces DFB, a data-free, low-budget, and high-efficacy clean-label backdoor attack. DFB is unique in that it requires no access to the training data, only knowledge of a specific target class. Tested on CIFAR-10, Tiny-ImageNet, and TSRD, DFB remains effective at poisoning rates of just 0.1%, 0.025%, and 0.4%, respectively. These rates are significantly lower than those required by existing methods such as LC, HTBA, BadNets, and Blend, yet DFB achieves higher attack success rates. Furthermore, our findings show that DFB poses a formidable challenge to four established backdoor defense algorithms, indicating its potential as a robust tool in advanced clean-label attack strategies.
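The abstract reports attack success rate and clean accuracy without defining them. The sketch below uses the definitions common in the backdoor literature, which are assumed here rather than taken from the paper: clean accuracy is the fraction of unmodified test samples classified correctly, and attack success rate (ASR) is the fraction of trigger-embedded test samples from non-target classes that the model assigns to the target class. The function names and toy predictions are illustrative only.

```python
from typing import Sequence


def clean_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clean test samples whose prediction matches the true label."""
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int], true_labels: Sequence[int], target_class: int
) -> float:
    """Fraction of triggered samples from non-target classes predicted as the target.

    Samples whose true label is already the target class are excluded,
    since predicting the target for them is not evidence of a backdoor.
    """
    hits = [
        p == target_class
        for p, y in zip(triggered_preds, true_labels)
        if y != target_class
    ]
    return sum(hits) / len(hits)


# Toy data (made up for illustration, not results from the paper).
labels = [0, 1, 2, 1, 0]
clean_preds = [0, 1, 2, 2, 0]       # model output on clean inputs
triggered_preds = [1, 1, 1, 1, 0]   # model output on the same inputs with a trigger

acc = clean_accuracy(clean_preds, labels)
asr = attack_success_rate(triggered_preds, labels, target_class=1)
print(f"clean accuracy = {acc:.2f}, ASR = {asr:.2f}")
```

A successful clean-label attack in this sense keeps `acc` close to that of an unpoisoned model while pushing `asr` toward 1.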

Paper Structure (14 sections, 2 equations, 11 figures, 4 tables, 2 algorithms)

Figures (11)

  • Figure 1: The DFB pipeline employs two trigger selection methods: one finds loss-max triggers, and the other dynamically creates triggers and injects them into the clean dataset. At inference time, the backdoored model processes clean data normally but assigns the target label to trigger-embedded data.
  • Figure 2: Visual Comparison of Poisons
  • Figure 3: Comparison of Attack Effects on Different Classes
  • Figure 4: Comparison of attack success rate and prediction accuracy of two DFB attack methods with other attack methods at different poisoning rates
  • Figure 5: The impact of different decoder network architectures on ASR
  • ...and 6 more figures