
Adv3D: Generating 3D Adversarial Examples for 3D Object Detection in Driving Scenarios with NeRF

Leheng Li, Qing Lian, Ying-Cong Chen

TL;DR

Adv3D studies how vulnerable camera-based 3D object detectors in autonomous driving are to physically realizable adversarial textures. It introduces a NeRF-based framework that renders photorealistic 3D patches and optimizes their textures under an Expectation Over Transformation (EOT) regime, using primitive-aware sampling, disentangled texture/shape modeling, and semantic-guided camouflage to keep the examples realistic and transferable across poses and scenes. On the nuScenes dataset, the attack causes a large performance reduction across multiple detectors, and the authors also propose a defense based on adversarial training through data augmentation that partially restores detector performance. The work describes a practical threat model and provides concrete methods both to evaluate robustness and to guide the design of more resilient 3D perception systems for real-world driving.

Abstract

Deep neural networks (DNNs) have been proven extremely susceptible to adversarial examples, which raises special safety-critical concerns for DNN-based autonomous driving stacks (e.g., 3D object detection). Although there are extensive works on image-level attacks, most are restricted to 2D pixel spaces, and such attacks are not always physically realistic in our 3D world. Here we present Adv3D, the first exploration of modeling adversarial examples as Neural Radiance Fields (NeRFs). Advances in NeRF provide photorealistic appearances and 3D-accurate generation, yielding a more realistic and realizable adversarial example. We train our adversarial NeRF by minimizing the surrounding objects' confidence predicted by 3D detectors on the training set. Then we evaluate Adv3D on the unseen validation set and show that it can cause a large performance reduction when rendering NeRF in any sampled pose. To generate physically realizable adversarial examples, we propose primitive-aware sampling and semantic-guided regularization that enable 3D patch attacks with camouflage adversarial texture. Experimental results demonstrate that the trained adversarial NeRF generalizes well to different poses, scenes, and 3D detectors. Finally, we provide a defense method to our attacks that involves adversarial training through data augmentation. Project page: https://len-li.github.io/adv3d-web
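
The training objective described above (minimize the detector's confidence, in expectation over sampled poses) can be illustrated with a toy sketch. This is not the authors' implementation: the NeRF renderer and the 3D detector are replaced by a small hypothetical function (`toy_confidence`), and the names `sample_pose`, `toy_features`, and `run_eot_demo`, the pose parameterization, and all hyperparameters are assumptions chosen only to show the shape of an EOT-style loop.

```python
import numpy as np


def sample_pose(rng):
    """Sample a hypothetical placement: normalized (x, y) offset plus yaw."""
    x, y = rng.uniform(-1.0, 1.0, size=2)
    yaw = rng.uniform(-np.pi, np.pi)
    return np.array([x, y, np.cos(yaw), np.sin(yaw)])


def toy_features(pose, W, b):
    """Stand-in for rendering the patch at `pose` and extracting features."""
    return np.tanh(W @ pose + b)


def toy_confidence(latent, pose, W, b):
    """Stand-in for a detector's confidence on a surrounding object, in (0, 1)."""
    logit = 2.0 + toy_features(pose, W, b) @ latent
    return 1.0 / (1.0 + np.exp(-logit))


def expected_confidence(latent, poses, W, b):
    """Monte Carlo estimate of the expected confidence over the given poses."""
    return float(np.mean([toy_confidence(latent, p, W, b) for p in poses]))


def run_eot_demo(steps=200, batch=16, lr=0.5, dim=8, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.5, size=(dim, 4))  # fixed toy "renderer + detector"
    b = rng.normal(0.0, 1.0, size=dim)
    latent = np.zeros(dim)                   # texture latent code being optimized

    # Poses never used for optimization, to check generalization across poses.
    held_out = [sample_pose(rng) for _ in range(256)]
    before = expected_confidence(latent, held_out, W, b)

    for _ in range(steps):
        grad = np.zeros(dim)
        for _ in range(batch):
            pose = sample_pose(rng)          # fresh random pose each draw (EOT)
            conf = toy_confidence(latent, pose, W, b)
            # d(sigmoid(logit)) / d(latent) = conf * (1 - conf) * features
            grad += conf * (1.0 - conf) * toy_features(pose, W, b)
        latent -= lr * grad / batch          # descend on the mean confidence

    after = expected_confidence(latent, held_out, W, b)
    return before, after


before, after = run_eot_demo()
print(f"mean confidence on held-out poses: {before:.3f} -> {after:.3f}")
```

Because each gradient step averages over freshly sampled poses, the latent code is pushed toward values that lower confidence across the pose distribution rather than for one fixed viewpoint, which is the property the abstract evaluates by rendering at randomly sampled poses on unseen data.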

Paper Structure (24 sections, 7 equations, 4 figures, 5 tables)


Figures (4)

  • Figure 1: Adv3D aims to generate 3D adversarial examples that consistently perform attacks under different poses during rendering. We initialize adversarial examples from Lift3D (CVPR 2023). During training, we optimize the texture latent codes of NeRF to minimize the detection confidence of all surrounding objects. During inference, we evaluate the performance reduction of pasting the adversarial patch rendered using randomly sampled poses on the validation set.
  • Figure 2: Rendered results of adversarial examples. (a) Image and semantic label of an instance predicted by NeRF. (b) Top: our example without semantic-guided regularization. Bottom: our example with semantic-guided regularization. (c) Multi-view consistent synthesis of our examples. (d, e) Results of transferring the adversarial textures of the side and back parts to other vehicles.
  • Figure 3: Visualization of BEVDet prediction on nuScenes validation set under our attacks. The visualization threshold is set at $0.6$. The adversarial NeRF can hide surrounding objects by minimizing their predicted confidence in a non-contact manner (making the yellow boxes disappear). Lidar point clouds are only used for visualization.
  • Figure 4: To examine the 3D-aware property of our adversarial examples, we ablate the relative performance drop by sampling adversarial examples within different bins of location and rotation.