
TPatch: A Triggered Physical Adversarial Patch

Wenjun Zhu, Xiaoyu Ji, Yushi Cheng, Shibo Zhang, Wenyuan Xu

TL;DR

TPatch is a physical adversarial patch that stays benign until a designed acoustic trigger activates it, at which point it launches a hiding, creating, or altering attack (HA/CA/AA) against the perception of an autonomous vehicle (AV). The framework combines four parts: trigger design (blur-based image distortion induced by acoustic injection), trigger-oriented patch optimization (losses for HA/CA/AA built from $L_{pos}$, $L_{neg}$, $L_{TV}$, and $L_{content}$), content-based camouflage (a perceptual loss on CNN features), and robustness enhancement (EoT and enlargement of the triggerable region). Evaluations on three object detectors (YOLO V3/V5 and Faster R-CNN) and eight image classifiers show high attack success in both simulation and real-world driving, along with transferability results and an analysis of environmental factors. The work also discusses defenses at the sensor, algorithm, and system levels, as well as the attack's limitations.
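To make the blur-based trigger more concrete, the following is a minimal sketch of how a blur described by a strength and an orientation could be simulated on an image. It assumes the distortion can be approximated by a straight-line point spread function (PSF); the function names `linear_blur_kernel` and `apply_blur` and the parameter values are illustrative and do not come from the paper, which estimates the PSF from real clear and blurred image pairs.

```python
import numpy as np


def linear_blur_kernel(strength: int, angle_deg: float) -> np.ndarray:
    """Hypothetical line-shaped PSF.

    strength  : blur length in pixels (>= 1), an assumed parameterization
    angle_deg : blur orientation in degrees, counter-clockwise from the x-axis
    """
    size = strength if strength % 2 == 1 else strength + 1  # odd size, so a centre pixel exists
    kernel = np.zeros((size, size), dtype=np.float64)
    c = size // 2
    theta = np.deg2rad(angle_deg)
    half = (strength - 1) / 2
    # Sample points along a line through the centre and mark the nearest pixels.
    for t in np.linspace(-half, half, 4 * strength):
        x = int(round(c + t * np.cos(theta)))
        y = int(round(c - t * np.sin(theta)))  # image rows grow downward
        kernel[y, x] = 1.0
    return kernel / kernel.sum()  # normalize so overall brightness is preserved


def apply_blur(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """2-D convolution of a grayscale image with a square kernel (edge padding)."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            # The flipped kernel index makes this a true convolution.
            out += kernel[k - 1 - dy, k - 1 - dx] * padded[dy:dy + h, dx:dx + w]
    return out


if __name__ == "__main__":
    psf = linear_blur_kernel(strength=5, angle_deg=0.0)
    impulse = np.zeros((7, 7))
    impulse[3, 3] = 1.0
    blurred = apply_blur(impulse, psf)
    print(np.round(blurred[3], 2))  # the single bright pixel is smeared horizontally
```

In a patch-training loop, a kernel of this kind would play the role of the positive trigger applied to the rendered patch before it is passed to the detector; how the triggers are actually chosen and applied is described in the paper's trigger design section, not here.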

Abstract

Autonomous vehicles increasingly utilize the vision-based perception module to acquire information about driving environments and detect obstacles. Correct detection and classification are important to ensure safe driving decisions. Existing works have demonstrated the feasibility of fooling the perception models such as object detectors and image classifiers with printed adversarial patches. However, most of them are indiscriminately offensive to every passing autonomous vehicle. In this paper, we propose TPatch, a physical adversarial patch triggered by acoustic signals. Unlike other adversarial patches, TPatch remains benign under normal circumstances but can be triggered to launch a hiding, creating or altering attack by a designed distortion introduced by signal injection attacks towards cameras. To avoid the suspicion of human drivers and make the attack practical and robust in the real world, we propose a content-based camouflage method and an attack robustness enhancement method to strengthen it. Evaluations with three object detectors, YOLO V3/V5 and Faster R-CNN, and eight image classifiers demonstrate the effectiveness of TPatch in both the simulation and the real world. We also discuss possible defenses at the sensor, algorithm, and system levels.

Paper Structure (40 sections, 16 equations, 19 figures, 8 tables)

This paper contains 40 sections, 16 equations, 19 figures, 8 tables.

Figures (19)

  • Figure 1: TPatch attacks. The patch is preset at the roadside and is benign to most passing vehicles, such as AV1 and AV3, but it can cause the targeted vehicle under an acoustic injection attack (AV2) to recognize a nonexistent stop sign, leading to tragic results.
  • Figure 2: Overview of TPatch generation. Based on the selected physical trigger signal, the adversary first estimates the image distortion it causes and designs the positive and negative triggers. She then trains a TPatch in accordance with the designed triggers, and finally improves the visual camouflage and the robustness of the patch to make it more practical in the real world. The generated TPatch can then be attached to any object to launch hiding, creating, or altering attacks.
  • Figure 3: Diagram of the trigger design. ① Explore the resonant frequency by injecting frequency-modulated acoustic signals. ② Estimate the PSF kernel with the clear image and the corresponding blurred image. ③ Extract the strength and orientation of blur.
  • Figure 4: Comparison between MSE loss and content loss. (a) shows the target camouflage of the patch. (b) and (c) show the patches generated with MSE loss and content loss, respectively.
  • Figure 5: Illustration of the relationship between the triggerable region and the designed triggers.
  • ...and 14 more figures