The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector

Aixuan Li, Mochu Xiang, Jing Zhang, Yuchao Dai

TL;DR

The paper tackles vulnerabilities of BEV-based 3D object detectors to non-contact, 3D adversarial attacks. It introduces Meeseeks Mesh, a 3D adversarial object optimized via differentiable rendering, occlusion-aware masking, and BEV spatial feature guidance to achieve 3D-consistent attacks across time and views. Key contributions include a unified pipeline for mesh placement, 3D-consistent rendering, an occlusion processing module, and a BEV-focused optimization objective that suppresses target detections while inducing misperceptions elsewhere. The approach demonstrates cross-model vulnerability on nuScenes benchmarks and highlights practical implications for robustness testing and defense design in autonomous driving perception systems.

Abstract

3D object detection is a critical component in autonomous driving systems. It allows real-time recognition and detection of vehicles, pedestrians, and obstacles under varying environmental conditions. Among existing methods, 3D object detection in the Bird's Eye View (BEV) has emerged as the mainstream framework. To ensure safe, robust, and trustworthy 3D object detection, 3D adversarial attacks are investigated, in which adversarial elements are placed in the 3D environment to evaluate model performance, e.g., putting a film on a car or clothing on a pedestrian. The vulnerability of a 3D object detection model to such attacks is an important indicator of its robustness against perturbations. To investigate this vulnerability, we generate non-invasive 3D adversarial objects tailored for real-world attack scenarios. Our method verifies the existence of universal adversarial objects that are spatially consistent across time and camera views. Specifically, we employ differentiable rendering techniques to accurately model the spatial relationship between adversarial objects and the target vehicle. Furthermore, we introduce an occlusion-aware module to enhance visual consistency and realism under different viewpoints. To maintain attack effectiveness across multiple frames, we design a BEV spatial feature-guided optimization strategy. Experimental results demonstrate that our approach can reliably suppress vehicle predictions from state-of-the-art 3D object detectors, making it an important tool for testing the robustness of 3D object detection models before deployment. Moreover, the generated adversarial objects exhibit strong generalization capabilities, retaining their effectiveness at various positions and distances in the scene.
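
To make the idea of occlusion-aware rendering concrete, the following is a minimal sketch of one common way to composite a rendered object into a camera frame: a per-pixel depth test. It is an illustration under stated assumptions, not the paper's module. The function name `composite_with_occlusion`, its parameters, and the assumption that a per-pixel scene depth map is available are all hypothetical.

```python
import numpy as np

def composite_with_occlusion(image, rendered_rgb, rendered_alpha,
                             rendered_depth, scene_depth):
    """Paste a rendered object onto an image, hiding occluded pixels.

    Hypothetical helper (not from the paper). All arrays share H x W:
      image, rendered_rgb: (H, W, 3) floats
      rendered_alpha:      (H, W) floats in [0, 1], object coverage
      rendered_depth:      (H, W) floats, object depth per pixel
      scene_depth:         (H, W) floats, assumed scene depth per pixel
    """
    # The object is visible only where it is closer to the camera
    # than the existing scene geometry.
    visible = (rendered_depth < scene_depth).astype(image.dtype)
    # Combine renderer coverage with the visibility test.
    mask = (rendered_alpha * visible)[..., None]
    # Standard alpha blend of the object over the original frame.
    return mask * rendered_rgb + (1.0 - mask) * image

# Tiny example: a white object at depth 5 over a black 2x2 frame.
frame = np.zeros((2, 2, 3))
obj_rgb = np.ones((2, 2, 3))
obj_alpha = np.ones((2, 2))
obj_depth = np.full((2, 2), 5.0)
# The right column of the scene is closer (depth 2) and hides the object.
scene = np.array([[10.0, 2.0], [10.0, 2.0]])
out = composite_with_occlusion(frame, obj_rgb, obj_alpha, obj_depth, scene)
```

In this example the object appears only in the left column, where it lies in front of the scene. A hard depth test like this is not differentiable with respect to depth, but the blend remains differentiable with respect to the object's colors, which is the quantity a texture optimization would update.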

Paper Structure

This paper contains 20 sections, 7 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Comparison of different adversarial attacks for 3D object detection with respect to 3D consistency and non-invasiveness.
  • Figure 2: Overview of our adversarial object generation pipeline. Appropriate locations are first chosen to place adversarial meshes in the 3D scene ("Mesh Placement in 3D Scene"), which are then rendered onto the input images. Our differentiable renderer ensures 3D-consistent, multi-view renderings with correct perspective. The "Realistic Occlusion Processing Module" further simulates partial visibility for improved robustness. Finally, the adversarial object is optimized via "BEV Spatial Feature-Guided Optimization" to enable effective attacks across both time and space.
  • Figure 3: Visualizations of attack effects in the image view, compared with Adv3D [liadv3d].
  • Figure 4: Visualization of adversarial objects generated for different models.
  • Figure 5: Visualizations of attack effects in the BEV. Top: predictions with initial objects. Bottom: predictions after inserting the adversarial object. Blue/red indicate ground-truth/predicted boxes for vehicles; green/cyan for other object categories.
  • ...and 3 more figures
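
The Figure 2 caption describes renderings that are 3D-consistent across views with correct perspective. A minimal sketch of the underlying geometry, assuming a standard pinhole camera model, is to place the object once in world coordinates and project the same points through each camera's own extrinsics and intrinsics. The function `project_points` and the example matrices are hypothetical and are not taken from the paper.

```python
import numpy as np

def project_points(points_world, world_to_cam, intrinsics):
    """Project (N, 3) world points into one pinhole camera.

    Hypothetical helper (not from the paper).
      world_to_cam: (4, 4) rigid transform from world to camera frame
      intrinsics:   (3, 3) camera matrix K
    Returns (N, 2) pixel coordinates and (N,) depths. Points with
    depth <= 0 lie behind the camera and should be discarded by the caller.
    """
    ones = np.ones((len(points_world), 1))
    homo = np.concatenate([points_world, ones], axis=1)   # (N, 4)
    cam = (world_to_cam @ homo.T).T[:, :3]                # camera-frame xyz
    depth = cam[:, 2]
    uvw = (intrinsics @ cam.T).T                          # homogeneous pixels
    pixels = uvw[:, :2] / depth[:, None]                  # perspective divide
    return pixels, depth

# One shared 3D point, 10 units in front of the first camera.
point = np.array([[0.0, 0.0, 10.0]])
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
cam_a = np.eye(4)            # first camera sits at the world origin
cam_b = np.eye(4)
cam_b[0, 3] = -1.0           # second camera is shifted 1 unit along +x
px_a, d_a = project_points(point, cam_a, K)
px_b, d_b = project_points(point, cam_b, K)
```

Because both cameras project the same world-space point, the object shifts between views exactly as the camera geometry dictates (here from pixel column 50 to column 40), rather than being pasted independently into each image.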