Backdoor Attack in the Physical World

Yiming Li, Tongqing Zhai, Yong Jiang, Zhifeng Li, Shu-Tao Xia

TL;DR

The paper analyzes backdoor attacks with static triggers, showing their vulnerability to trigger location and appearance changes in the physical world. It proposes transformation-based preprocessing as a lightweight defense and introduces an attack-enhancement framework that uses randomized transformations during training to maintain effectiveness under such defenses. Experiments on CIFAR-10 with BadNets, Blended, and Consistent attacks demonstrate that ShrinkPad and Flip can substantially reduce attack success, while enhanced attacks remain robust and can even succeed in physical-world settings. The work highlights a practical arms race between backdoor robustness and defense and motivates designing trigger-robust attacks and generic defenses for real-world deployment.
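The two preprocessing defenses named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the nearest-neighbour shrinking, the `shrink` parameter, and the use of nested lists for a grayscale image are all assumptions made to keep the example dependency-free. The idea is that either transformation changes where the trigger ends up in the image before the model sees it.

```python
import random


def flip(image):
    """Horizontally flip a grayscale image stored as a list of rows."""
    return [row[::-1] for row in image]


def shrink_pad(image, shrink=4, rng=None):
    """Shrink the image by `shrink` pixels per dimension, then zero-pad it
    back to the original size at a random offset.

    Assumption: nearest-neighbour resampling stands in for whatever
    resizing method a real pipeline would use.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    new_h, new_w = h - shrink, w - shrink

    # Nearest-neighbour downsampling to (new_h, new_w).
    small = [
        [image[r * h // new_h][c * w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]

    # Random placement of the shrunken image inside a zero canvas.
    top = rng.randint(0, shrink)
    left = rng.randint(0, shrink)
    out = [[0] * w for _ in range(h)]
    for r in range(new_h):
        out[top + r][left:left + new_w] = small[r]
    return out


# A 32 x 32 image with a 3 x 3 trigger patch in the bottom-right corner.
image = [[0] * 32 for _ in range(32)]
for r in range(29, 32):
    for c in range(29, 32):
        image[r][c] = 255

flipped = flip(image)          # patch now sits in the bottom-left corner
defended = shrink_pad(image)   # patch is resampled and randomly displaced
```

Both functions return an image of the original size, so they can be placed in front of an existing classifier without changing it.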

Abstract

Backdoor attacks aim to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of an infected model is maliciously changed when the hidden backdoor is activated by an attacker-defined trigger. Currently, most existing backdoor attacks adopt a static-trigger setting, i.e., triggers across the training and testing images share the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing trigger characteristics. We demonstrate that this attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training. As such, those attacks are far less effective in the physical world, where the location and appearance of the trigger in the digitized image may differ from those of the trigger used for training. Moreover, we also discuss how to alleviate this vulnerability. We hope that this work will inspire further exploration of backdoor properties and help in the design of more advanced backdoor attack and defense methods.
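To make the static-trigger setting concrete, the sketch below stamps a square patch onto an image at a fixed training-time location, then at a displaced location and with a different intensity, mimicking the mismatch that can arise once a trigger is digitized from the physical world. The helper `stamp_trigger`, the 3 x 3 patch, and the specific coordinates and values are hypothetical choices for illustration, not details taken from the paper.

```python
def stamp_trigger(image, top, left, size=3, value=255):
    """Return a copy of a grayscale image (list of rows) with a
    size x size square patch of intensity `value` at (top, left)."""
    out = [row[:] for row in image]
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = value
    return out


clean = [[0] * 32 for _ in range(32)]

# Static trigger: same location and appearance for training and testing.
train_poisoned = stamp_trigger(clean, top=29, left=29)

# Test-time mismatches of the kind the abstract describes.
shifted = stamp_trigger(clean, top=27, left=27)             # location changed
dimmed = stamp_trigger(clean, top=29, left=29, value=128)   # appearance changed
```

A model trained only on `train_poisoned`-style samples has never seen the `shifted` or `dimmed` variants, which is the inconsistency the paper analyzes.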

Paper Structure

This paper contains 11 sections, 1 equation, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Heatmap of the attack success rate (ASR) when the trigger is placed at different positions in attacked images. The right corner is the position of the trigger in the poisoned images used for training.
  • Figure 2: ASR and appearance of the trigger with different non-zero color values in attacked images. The red dot indicates the ASR of the trigger with the original color value (128).
  • Figure 3: Illustration of the characteristics of the backdoor trigger. The red box represents the boundary of the minimum covering box, and the red pixel indicates the trigger location.
  • Figure 4: Printed CIFAR-10 images photographed by a camera at different distances.

Theorems & Definitions (3)

  • Definition 1: Minimum Covering Box
  • Definition 2: Two Characteristics of Backdoor Trigger
  • Definition 3: Transformation-based Defense
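
Definition 1 is only named in this listing. As a hedged illustration, the sketch below assumes that the minimum covering box is the smallest axis-aligned rectangle enclosing every non-zero trigger pixel, which matches the red box described in the Figure 3 caption. The function name and the inclusive `(top, left, bottom, right)` return convention are assumptions.

```python
def minimum_covering_box(trigger):
    """Return (top, left, bottom, right), inclusive, of the smallest
    axis-aligned box containing all non-zero entries of `trigger`
    (a list of rows), or None if the trigger is all zeros."""
    rows = [r for r, row in enumerate(trigger) if any(row)]
    if not rows:
        return None
    width = len(trigger[0])
    cols = [c for c in range(width) if any(row[c] for row in trigger)]
    return rows[0], cols[0], rows[-1], cols[-1]


# A 32 x 32 trigger mask with a 3 x 3 patch in the bottom-right corner.
trigger = [[0] * 32 for _ in range(32)]
for r in range(29, 32):
    for c in range(29, 32):
        trigger[r][c] = 128

box = minimum_covering_box(trigger)
```

Under this assumption, the box describes both where the trigger sits in the image and how large it is.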