
Impart: An Imperceptible and Effective Label-Specific Backdoor Attack

Jingke Zhao, Zan Wang, Yongwei Wang, Lanjun Wang

TL;DR

This study proposes Impart, a novel imperceptible backdoor attack framework for the scenario in which the attacker has no access to the victim model. It introduces a label-specific attack that significantly enhances attack capability in the all-to-all setting.

Abstract

Backdoor attacks have been shown to pose severe threats in real security-critical scenarios. Although previous works achieve high attack success rates, they either require access to the victim model, which may significantly reduce their threat in practice, or produce poisoned samples that are visually noticeable. Moreover, there is still room to improve attack success rates when different poisoned samples may have different target labels (the all-to-all setting). In this study, we propose a novel imperceptible backdoor attack framework, named Impart, for the scenario in which the attacker has no access to the victim model. Specifically, to enhance attack capability in the all-to-all setting, we first propose a label-specific attack. Unlike previous works, which search for an imperceptible pattern and add it to the source image to form the poisoned image, we then use a surrogate model to generate perturbations that align with the target label in image feature space. In this way, the generated poisoned images carry knowledge about the target class, which significantly enhances the attack capability.
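
The idea of aligning a bounded perturbation with a target class in feature space can be illustrated with a small sketch. This is not the authors' feature fitter: the linear surrogate `W`, the L-infinity budget `eps`, the squared-distance objective, and the function names are all assumptions made for illustration, and the `(y + 1) mod K` mapping is only the convention commonly used for all-to-all attacks in prior backdoor work, not necessarily the one Impart uses.

```python
import numpy as np

def all_to_all_targets(labels, num_classes):
    # Common all-to-all convention (assumption): class y is relabeled to (y + 1) mod K.
    return (labels + 1) % num_classes

def feature_aligned_perturbation(x, target_feature, W, eps=0.03, steps=50):
    """Projected gradient descent on ||W(x + delta) - target_feature||^2.

    x: flattened image with values in [0, 1]
    W: hypothetical linear surrogate feature extractor, shape (d_feat, d_in)
    eps: L-infinity bound that keeps the perturbation small (assumed budget)
    """
    # Step size 1/L, where L = 2 * ||W||_2^2 is the gradient's Lipschitz constant.
    lr = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2)
    delta = np.zeros_like(x)
    for _ in range(steps):
        diff = W @ (x + delta) - target_feature
        grad = 2.0 * W.T @ diff               # gradient of the squared distance
        delta = delta - lr * grad             # move features toward the target
        delta = np.clip(delta, -eps, eps)     # stay within the perturbation budget
        delta = np.clip(x + delta, 0.0, 1.0) - x  # keep the poisoned image valid
    return delta
```

In a real pipeline the surrogate would be a trained network and the target feature would come from samples of the target class; the linear model here only keeps the example self-contained.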

Paper Structure

This paper contains 17 sections, 8 equations, 6 figures, and 10 tables.

Figures (6)

  • Figure 1: The framework of the proposed Impart method consists of four phases. In phase ①, we train a surrogate model. In phase ②, we generate poisoned training data and poisoned test data using the feature fitter. In phases ③ and ④, we poison and test the victim model respectively.
  • Figure 2: Resilience to STRIP. Entropy distributions on CIFAR-10, GTSRB and CIFAR-100.
  • Figure 3: Effectiveness of our method under the Neural Cleanse defense on CIFAR-10, GTSRB, and CIFAR-100 datasets.
  • Figure 4: Resilience to Spectral Signatures.
  • Figure 5: Influence of the poisoning ratio. Panel (a), left, shows its effect on BA; panel (b), right, shows its effect on ASR (attack success rate).
  • ...and 1 more figures
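
Figure 2 reports entropy distributions under STRIP, a run-time defense that blends an input with clean images and inspects the entropy of the model's predictions; consistently low entropy suggests a trigger is dominating the output. The sketch below shows how such an entropy score might be computed. The blending weight `alpha`, the function name, and the `predict_proba` callable are hypothetical and do not reflect the paper's evaluation code.

```python
import numpy as np

def strip_entropy(x, clean_images, predict_proba, alpha=0.5):
    """Mean Shannon entropy of predictions on x blended with each clean image.

    x: one input image, shape (...)
    clean_images: batch of clean images, shape (n, ...)
    predict_proba: hypothetical callable mapping a batch to class probabilities (n, K)
    alpha: blending weight (assumed; implementations differ in how they superimpose)
    """
    blended = alpha * x[None, ...] + (1.0 - alpha) * clean_images
    probs = predict_proba(blended)
    # Small constant avoids log(0) for confident predictions.
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return float(entropy.mean())
```

An attack that is resilient to STRIP would yield entropy distributions for poisoned inputs that overlap with those of clean inputs.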