Backdoor Attack in the Physical World
Yiming Li, Tongqing Zhai, Yong Jiang, Zhifeng Li, Shu-Tao Xia
TL;DR
The paper analyzes backdoor attacks with static triggers, showing their vulnerability to trigger location and appearance changes in the physical world. It proposes transformation-based preprocessing as a lightweight defense and introduces an attack-enhancement framework that uses randomized transformations during training to maintain effectiveness under such defenses. Experiments on CIFAR-10 with BadNets, Blended, and Consistent attacks demonstrate that ShrinkPad and Flip can substantially reduce attack success, while enhanced attacks remain robust and can even succeed in physical-world settings. The work highlights a practical arms race between backdoor robustness and defense and motivates designing trigger-robust attacks and generic defenses for real-world deployment.
Abstract
Backdoor attacks aim to inject a hidden backdoor into deep neural networks (DNNs), such that the predictions of infected models are maliciously changed when the hidden backdoor is activated by an attacker-defined trigger. Most existing backdoor attacks adopt a static-trigger setting, i.e., triggers in the training and testing images have the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing trigger characteristics. We demonstrate that the paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training. As such, those attacks are far less effective in the physical world, where the location and appearance of the trigger in the digitized image may differ from those of the trigger used for training. We also discuss how to alleviate this vulnerability. We hope that this work will inspire further exploration of backdoor properties and help the design of more advanced backdoor attack and defense methods.
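To make the static-trigger assumption concrete, the following is a minimal toy sketch in plain Python, not the paper's implementation. It stamps a 2x2 trigger at a fixed corner of an 8x8 image and then applies two transformations in the spirit of the Flip and ShrinkPad defenses named in the TL;DR. The image size, pixel values, nearest-neighbour resize, and all helper names (`stamp_trigger`, `flip`, `shrink_pad`, `trigger_intact`) are assumptions chosen for illustration; the exact ShrinkPad procedure used in the experiments may differ.

```python
import random

def stamp_trigger(image, trigger, top, left):
    """Return a copy of `image` with `trigger` pasted at (top, left)."""
    out = [row[:] for row in image]
    for i, trigger_row in enumerate(trigger):
        for j, value in enumerate(trigger_row):
            out[top + i][left + j] = value
    return out

def flip(image):
    """Horizontal flip: mirror every row."""
    return [row[::-1] for row in image]

def shrink_pad(image, shrink, rng):
    """Shrink by `shrink` pixels (nearest neighbour), then zero-pad back
    to the original size at a random offset (assumed ShrinkPad variant)."""
    h, w = len(image), len(image[0])
    nh, nw = h - shrink, w - shrink
    small = [[image[i * h // nh][j * w // nw] for j in range(nw)]
             for i in range(nh)]
    top, left = rng.randint(0, shrink), rng.randint(0, shrink)
    out = [[0] * w for _ in range(h)]
    for i in range(nh):
        for j in range(nw):
            out[top + i][left + j] = small[i][j]
    return out

def trigger_intact(image, trigger, top, left):
    """True if the trigger pixels sit exactly where training placed them."""
    return all(
        image[top + i][left + j] == value
        for i, trigger_row in enumerate(trigger)
        for j, value in enumerate(trigger_row)
    )

# Toy data: blank 8x8 image, 2x2 trigger stamped in the bottom-right corner.
clean = [[0] * 8 for _ in range(8)]
trigger = [[9, 9], [9, 9]]
poisoned = stamp_trigger(clean, trigger, 6, 6)

flipped = flip(poisoned)
shrunk = shrink_pad(poisoned, 2, random.Random(0))

print(trigger_intact(poisoned, trigger, 6, 6))  # trigger at the training location
print(trigger_intact(flipped, trigger, 6, 6))   # flip moves it to the other corner
print(trigger_intact(shrunk, trigger, 6, 6))    # shrinking alters its appearance
```

In this toy setting, the flip relocates the trigger to the opposite corner and the shrink step changes both its size and position, so a check tied to the training-time location and appearance no longer matches. This mirrors, in a very simplified way, the mismatch between training and testing triggers that the abstract describes.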
