BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection

Saket S. Chaturvedi, Lan Zhang, Wenbin Zhang, Pan He, Xiaoyong Yuan

TL;DR

BadFusion reveals a new security risk in multi-modal 3D object detection: 2D camera triggers can be crafted to survive LiDAR-camera fusion and mislead 3D predictions. It introduces fusion-aware 2D triggers placed in regions where the 2D projection of the LiDAR signal is dense, plus a LiDAR-free variant that predicts those dense regions from camera data. The attack achieves high success rates across multiple fusion models ($\text{ASR} \approx 95\%$) while preserving benign performance. Compared with prior 2D-oriented attacks, BadFusion is substantially more effective because its triggers stay dense and consistent throughout the fusion process. The work highlights a critical vulnerability in contemporary autonomous-driving perception pipelines and calls for defenses against such fusion-aware backdoors.

Abstract

3D object detection plays an important role in autonomous driving; however, its vulnerability to backdoor attacks has become evident. By injecting "triggers" to poison the training dataset, backdoor attacks manipulate the detector's prediction for inputs containing these triggers. Existing backdoor attacks against 3D object detection primarily poison 3D LiDAR signals, where large 3D triggers are injected to ensure their visibility within the sparse 3D space, rendering them easy to detect and impractical in real-world scenarios. In this paper, we examine the robustness of 3D object detection, exploring a new backdoor attack surface through 2D cameras. Given the prevalent adoption of camera and LiDAR signal fusion for high-fidelity 3D perception, we investigate the potential of camera signals to disrupt the process. Although the dense nature of camera signals enables the use of small, nearly imperceptible triggers to mislead 2D object detection, realizing 2D-oriented backdoor attacks against 3D object detection is non-trivial. The primary challenge arises from the fusion process, which transforms camera signals into 3D space and thereby weakens the association between the 2D trigger and the target output. To tackle this issue, we propose BadFusion, a 2D-oriented backdoor attack against LiDAR-camera fusion methods for 3D object detection that preserves trigger effectiveness throughout the entire fusion process. The evaluation demonstrates the effectiveness of BadFusion, which achieves a significantly higher attack success rate than existing 2D-oriented attacks.
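To make the idea of placing a trigger in a dense 2D-LiDAR projection region more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the grid-cell density heuristic, the function names (`project_points`, `densest_cell`, `stamp_trigger`), and the parameters (`P`, `cell`, `size`) are all assumptions made for illustration. The sketch projects LiDAR points into the image with a 3x4 projection matrix, counts projected points per grid cell, and stamps a small square patch at the densest cell.

```python
from collections import Counter


def project_points(points_xyz, P):
    """Project 3D points (assumed to be in the camera frame) to pixel
    coordinates with a 3x4 projection matrix P (hypothetical input)."""
    pixels = []
    for x, y, z in points_xyz:
        u, v, w = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in P)
        if w <= 0:  # point lies behind the camera, so it has no valid pixel
            continue
        pixels.append((u / w, v / w))
    return pixels


def densest_cell(pixels, width, height, cell=32):
    """Return (left, top, count) of the grid cell that contains the most
    projected points, or None if no point falls inside the image."""
    counts = Counter()
    for u, v in pixels:
        if 0 <= u < width and 0 <= v < height:
            counts[(int(u) // cell, int(v) // cell)] += 1
    if not counts:
        return None
    (cx, cy), n = max(counts.items(), key=lambda kv: kv[1])
    return cx * cell, cy * cell, n


def stamp_trigger(image, left, top, size, value=255):
    """Return a copy of a grayscale image (list of rows) with a
    size x size patch set to `value`; the solid patch is an assumed
    stand-in for whatever trigger pattern an attacker would use."""
    out = [row[:] for row in image]
    for r in range(top, min(top + size, len(out))):
        for c in range(left, min(left + size, len(out[r]))):
            out[r][c] = value
    return out
```

A caller would chain the three steps: project the LiDAR sweep, pick the densest cell, and stamp the patch at that cell's top-left corner. The intent is only to show why density matters: a patch placed where few LiDAR points project would be sampled sparsely during fusion, which is the failure mode the paper attributes to existing 2D-oriented attacks.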

Paper Structure

This paper contains 26 sections, 3 equations, 8 figures, 7 tables, and 1 algorithm.

Figures (8)

  • Figure 1: The pipeline of 2D (camera) and 3D (LiDAR) data fusion for 3D object detection in autonomous driving.
  • Figure 2: Comparison between existing 2D-oriented backdoor attacks and the proposed BadFusion. The first two columns show the camera signal with triggers and the 2D projection of the LiDAR signal. After transforming camera signals to 2D LiDAR projection during fusion, the triggers injected via existing 2D-oriented backdoor attacks become sparse and inconsistent (the third column), making triggers ineffective in attacks. Hence, these attacks do not change the predictions of 3D bounding boxes (the fourth column). The proposed BadFusion, by injecting dense and consistent triggers throughout the fusion process, successfully manipulates the detection and reduces the sizes of 3D bounding boxes for vehicles.
  • Figure 3: Examples of different attack goals in BadFusion. Panel (a) shows the predictions of a clean model without backdoor triggers. Panel (b) shows a resizing attack, which reduces the size of the predicted bounding box. Panel (c) shows a disappear attack, which removes a vehicle's bounding box from the prediction so that the vehicle goes undetected.
  • Figure 4: Different trigger patterns used in BadFusion.
  • Figure 5: Different distributions of backdoor sample selection in BadFusion against MVX-Net.
  • ...and 3 more figures