TPatch: A Triggered Physical Adversarial Patch
Wenjun Zhu, Xiaoyu Ji, Yushi Cheng, Shibo Zhang, Wenyuan Xu
TL;DR
TPatch is a physical adversarial patch that stays benign until activated by a designed acoustic trigger, at which point it can hide, create, or alter objects in an autonomous vehicle's (AV's) perception. The framework couples four components: trigger design (blur-based, acoustically induced image distortion), trigger-oriented patch optimization (losses for the hiding, creating, and altering attacks, HA/CA/AA, built from $L_{pos}$, $L_{neg}$, $L_{TV}$, and $L_{content}$), content-based camouflage (a perceptual loss on CNN features), and robustness enhancement (Expectation over Transformation, EoT, and trigger-region enlargement). Empirical results across three detectors (YOLO V3/V5, Faster R-CNN) and eight classifiers show high attack success in both simulation and real-world driving, together with transferability demonstrations and an analysis of environmental factors. The work also discusses defenses at the sensor, algorithm, and system levels, along with limitations, highlighting the need for holistic AV system defenses against sensor-triggered adversarial patches and outlining avenues for future research.
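To make two of the ingredients named above more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that $L_{TV}$ takes the common anisotropic total-variation form, and it uses a simple horizontal box blur as a hypothetical stand-in for the acoustically induced distortion; the function names `total_variation` and `horizontal_blur` and the `length` parameter are illustrative and do not come from the paper.

```python
import numpy as np


def total_variation(patch: np.ndarray) -> float:
    """Anisotropic total variation of an H x W x C patch.

    Assumption: L_TV is the usual sum of absolute differences between
    neighbouring pixels; the paper's exact definition may differ.
    """
    dh = np.abs(patch[1:, :, :] - patch[:-1, :, :]).sum()  # vertical neighbours
    dw = np.abs(patch[:, 1:, :] - patch[:, :-1, :]).sum()  # horizontal neighbours
    return float(dh + dw)


def horizontal_blur(image: np.ndarray, length: int) -> np.ndarray:
    """Box-blur every row of an H x W x C image over `length` pixels.

    Hypothetical stand-in for the blur trigger, not the paper's model.
    Borders are zero-padded, so edge pixels are darkened slightly.
    """
    kernel = np.ones(length) / length
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, image
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.random((32, 32, 3))      # random patch with values in [0, 1)
    triggered = horizontal_blur(patch, 5)
    print("TV of raw patch:    ", total_variation(patch))
    print("TV of blurred patch:", total_variation(triggered))
```

In a patch optimization of this kind, a smoothness term such as `total_variation` would typically be added to the attack losses with a small weight so that the printed patch avoids high-frequency noise, and a blur such as `horizontal_blur` would be applied to the patch region when evaluating the triggered case.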
Abstract
Autonomous vehicles increasingly rely on vision-based perception modules to acquire information about driving environments and detect obstacles. Correct detection and classification are essential for safe driving decisions. Existing works have demonstrated the feasibility of fooling perception models, such as object detectors and image classifiers, with printed adversarial patches. However, most of these patches attack every passing autonomous vehicle indiscriminately. In this paper, we propose TPatch, a physical adversarial patch triggered by acoustic signals. Unlike other adversarial patches, TPatch remains benign under normal circumstances but can be triggered to launch a hiding, creating, or altering attack by a designed distortion introduced through signal injection attacks on cameras. To avoid arousing the suspicion of human drivers and to make the attack practical and robust in the real world, we propose a content-based camouflage method and an attack robustness enhancement method. Evaluations with three object detectors (YOLO V3/V5 and Faster R-CNN) and eight image classifiers demonstrate the effectiveness of TPatch in both simulation and the real world. We also discuss possible defenses at the sensor, algorithm, and system levels.
