Efficient Backdoor Attacks for Deep Neural Networks in Real-world Scenarios

Ziqiang Li, Hong Sun, Pengfei Xia, Heng Li, Beihao Xia, Yi Wu, Bin Li

TL;DR

This work tackles backdoor attacks under realistic data collection conditions where adversaries lack full access to training data, introducing the data-constrained backdoor attack paradigm. It identifies entanglement between benign and poisoning features as a key bottleneck and proposes a CLIP-guided framework with three techniques: CLIP-CFE for clean feature suppression, and CLIP-UAP and CLIP-CFA for poisoning feature augmentation. Through extensive experiments on CIFAR-10/100 and ImageNet-50 across multiple architectures, the authors demonstrate substantial performance gains over baseline attacks, with some settings achieving over 100% improvement in data-constrained scenarios, while preserving Benign Accuracy. The results underscore practical threat potential and offer actionable insights for defense and robust evaluation in multi-source data settings, while also suggesting future work in domain-specific CLIP adaptations and broader applicability.

Abstract

Recent deep neural networks (DNNs) have come to rely on vast amounts of training data, providing an opportunity for malicious attackers to exploit and contaminate the data to carry out backdoor attacks. However, existing backdoor attack methods make two unrealistic assumptions: that all training data comes from a single source and that attackers have full access to the training data. In this paper, we introduce a more realistic attack scenario where victims collect data from multiple sources, and attackers cannot access the complete training data. We refer to this scenario as data-constrained backdoor attacks. In such cases, previous attack methods suffer from severe efficiency degradation due to the entanglement between benign and poisoning features during the backdoor injection process. To tackle this problem, we introduce three CLIP-based technologies from two distinct streams: Clean Feature Suppression and Poisoning Feature Augmentation. Together, they offer an effective solution for data-constrained backdoor attacks. The results demonstrate remarkable improvements, with some settings achieving over 100% improvement compared to existing attacks in data-constrained scenarios. Code is available at https://github.com/sunh1113/Efficient-backdoor-attacks-for-deep-neural-networks-in-real-world-scenarios

Paper Structure (57 sections, 17 equations, 28 figures, 6 tables)

This paper contains 57 sections, 17 equations, 28 figures, 6 tables.

Figures (28)

  • Figure 1: One-to-one (O2O) and many-to-one (M2O) data collection modes. M2O mode is more in line with practical scenarios where data collectors collect data from multiple sources. In this mode, the attacker cannot have all the data available to the victims.
  • Figure 2: Attack success rate (ASR) under different data-constrained backdoor attacks. Each experiment is repeated 5 times, and the solid lines represent the mean results. (a): The abscissa is the number ($P$) of samples in the poisoning set $\mathcal{P'}$. (b): The experiments are conducted with the BadNets, Blended, and UAP triggers, with a poisoning rate of $2\%$ ($P=1000$) for each. The abscissa is the number of classes ($|Y'|$) in the poisoning set $\mathcal{P'}$. Specifically, '1 (1)' and '1 (0)' denote dirty-label single-class ($Y'=\{c\}, c\neq k$) and clean-label single-class ($Y'=\{k\}$), respectively. (c): The poisoning rates of the experiments with the BadNets, Blended, and UAP triggers are $2\%$ ($P=1000$), $2\%$ ($P=1000$), and $1\%$ ($P=500$), respectively. The abscissa is the domain rate, which represents the proportion of poisoning samples drawn from $\mathcal{D}\setminus\mathcal{D'}$ and $\mathcal{D'}$.
  • Figure 3: Attack success rate (ASR) of the (a): number-constrained backdoor attacks, (b): clean-label single-class attack (the access category $Y'$ is set to $\{0\}$), (c): dirty-label single-class attack (the access category $Y'$ is set to $\{1\}$), and (d): domain-constrained backdoor attacks (domain rate is set to 0) on the CIFAR-100 dataset. The red points represent results w/o CLIP-based Clean Feature Erasing (CLIP-CFE), while the green points represent results w/ CLIP-CFE. All experiments are repeated 5 times, and the results are the mean of the five runs.
  • Figure 4: Ablation studies on the CIFAR-100 dataset. All results are computed as the mean of five different runs. (a): The ASR with different poisoning rates. (b): The ASR with different accessible classes of poisoning samples, where 1 (0) and 1 (1) on the abscissa represent the clean-label and dirty-label single-class attacks, respectively. (c): The ASR with different domain rates. (d): The ASR across different pre-trained CLIP models for number-constrained backdoor attacks.
  • Figure 5: Three data-constrained attack scenarios. In number-constrained backdoor attacks, the data provided by each data source is independently and identically distributed; in class-constrained backdoor attacks, each data source provides data belonging to different categories; and in domain-constrained backdoor attacks, each data source provides data from different domains.
  • ...and 23 more figures
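
The captions above describe poisoning sets limited by sample count, by accessible classes, and by domain. As a rough illustration only, the following Python sketch shows how such constraints might be expressed when choosing which samples to poison. It is not the authors' implementation: the function names, the toy label list, and the reading of the domain rate as the in-domain fraction of the poisoning budget are all assumptions made for this example. The training-set size of 50,000 is the standard CIFAR-10/100 training split, which matches the stated $2\%$ ($P=1000$).

```python
import random


def poisoning_rate(num_poisoned, num_train):
    """Fraction of the victim's training set that carries the trigger."""
    return num_poisoned / num_train


def select_poisoning_indices(labels, budget, allowed_classes=None, seed=0):
    """Pick up to `budget` indices from the attacker-accessible pool.

    labels: list of integer class labels for the accessible samples.
    allowed_classes: None for a number-constrained selection (any class),
        or a set Y' of classes for a class-constrained selection.
    (Hypothetical helper, named for this illustration only.)
    """
    rng = random.Random(seed)
    if allowed_classes is None:
        candidates = list(range(len(labels)))
    else:
        candidates = [i for i, y in enumerate(labels) if y in allowed_classes]
    rng.shuffle(candidates)
    return sorted(candidates[:budget])


def split_by_domain_rate(budget, domain_rate):
    """Split the poisoning budget between two data pools.

    Assumption: `domain_rate` is the fraction of the budget drawn from the
    in-domain pool; the remainder comes from the out-of-domain pool.
    """
    in_domain = round(budget * domain_rate)
    return in_domain, budget - in_domain


if __name__ == "__main__":
    # 1000 poisoned samples out of 50,000 training images.
    print(poisoning_rate(1000, 50_000))

    # Toy pool: 200 samples, 10 classes, 20 samples per class.
    toy_labels = [i % 10 for i in range(200)]
    target_class = 0

    # Clean-label single-class: only target-class samples are accessible.
    clean = select_poisoning_indices(toy_labels, 20, {target_class})
    # Dirty-label single-class: only one non-target class is accessible.
    dirty = select_poisoning_indices(toy_labels, 20, {1})
    print(len(clean), len(dirty))

    # Domain rate 0 under the assumption above: all out-of-domain samples.
    print(split_by_domain_rate(1000, 0.0))
```

The sketch covers only sample selection; trigger design (BadNets, Blended, UAP) and the CLIP-based techniques are outside its scope.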