Generating Transferable Adversarial Simulation Scenarios for Self-Driving via Neural Rendering

Yasasa Abeysirigoonawardena; Kevin Xie; Chuhan Chen; Salar Hosseini; Ruiting Chen; Ruiqi Wang; Florian Shkurti

TL;DR

The paper addresses the challenge of safely evaluating self-driving systems by automatically generating adversarial, 3D-consistent scenarios. It builds a differentiable surrogate scene from Neural Radiance Fields (NeRF) and formulates adversarial scenario generation as a high-dimensional optimal-control problem: perturb the textures of inserted objects so that the driving policy deviates as far as possible from its nominal trajectory. Using implicit differentiation (the adjoint method) and differentiable rendering, it computes gradient-based attacks in the NeRF surrogate that often transfer to the deployment scene, in both simulated and real-world tests, without further optimization. Gradient-based attacks outperform random-search baselines and can reveal safety-critical failures, offering a scalable framework for automated AV evaluation and robustness testing. Limitations include reliance on differentiable policies and potentially non-smooth optimization landscapes; future work aims to handle non-differentiable components and to improve sim-to-real transfer more broadly.
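
One way to write the optimal-control problem described above is sketched below. The notation is an assumption for illustration and is not taken from the paper: \theta are the texture parameters of the inserted objects, x_t is the vehicle state, R is the renderer, \pi is the frozen driving policy, and f is the kinematic model. The squared Euclidean deviation is likewise an assumed choice of cost.

```latex
\max_{\theta}\; J(\theta) = \sum_{t=0}^{T} \left\lVert x_t(\theta) - x_t^{\mathrm{ref}} \right\rVert^2
\quad \text{subject to} \quad
o_t = R(x_t;\theta), \qquad
u_t = \pi(o_t), \qquad
x_{t+1} = f(x_t, u_t).
```

Because \theta enters every rendered observation o_t, its effect on the trajectory accumulates through the closed loop, which is why a gradient of J with respect to \theta has to be propagated back through the whole rollout.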

Abstract

Self-driving software pipelines include components that are learned from a significant number of training examples, yet it remains challenging to evaluate the overall system's safety and generalization performance. As real-world deployment of autonomous vehicles scales up, it is critically important to automatically find simulation scenarios in which the driving policies will fail. We propose a method that efficiently generates adversarial simulation scenarios for autonomous driving by solving an optimal control problem that aims to maximally perturb the policy from its nominal trajectory. Given an image-based driving policy, we show that we can inject new objects into a neural rendering representation of the deployment scene and optimize their texture in order to generate adversarial sensor inputs to the policy. We demonstrate that adversarial scenarios discovered purely in the neural renderer (surrogate scene) can often be successfully transferred to the deployment scene, without further optimization. We show that this transfer occurs in both simulated and real environments, provided the learned surrogate scene is sufficiently close to the deployment scene.
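
To make the idea of optimizing a texture through a closed driving loop concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the one-dimensional state, the `visibility` stand-in for a renderer, the linear policy weights `W`, and all constants are hypothetical. The gradient is computed with a discrete adjoint recursion, which for this toy coincides with reverse-mode differentiation through the rollout.

```python
import math

DT, K, T = 0.1, 1.0, 50        # step size, feedback gain, horizon (illustrative)
W = [0.8, -0.5, 0.3]           # hypothetical policy weights on three "pixels"

def visibility(y):
    # Toy stand-in for rendering: the object fades as the car drifts away.
    return math.exp(-y * y)

def rollout(theta):
    """Closed loop render -> policy -> kinematics; returns offsets y_0..y_T."""
    ys = [0.0]
    for _ in range(T):
        y = ys[-1]
        pixels = [visibility(y) * math.tanh(th) for th in theta]  # bounded texture
        steer = -K * y + sum(w * p for w, p in zip(W, pixels))    # frozen policy
        ys.append(y + DT * steer)                                 # kinematic step
    return ys

def deviation(theta):
    # Objective: summed squared deviation from the reference trajectory y = 0.
    return sum(y * y for y in rollout(theta))

def adjoint_gradient(theta):
    """Gradient of deviation() w.r.t. theta from one backward (adjoint) sweep."""
    ys = rollout(theta)
    s = sum(w * math.tanh(th) for w, th in zip(W, theta))
    grad = [0.0] * len(theta)
    lam = 2.0 * ys[T]                          # lambda_T = dJ/dy_T
    for t in range(T - 1, -1, -1):
        y, vis = ys[t], visibility(ys[t])
        # lam holds lambda_{t+1}; add its product with d y_{t+1} / d theta_i.
        for i, th in enumerate(theta):
            grad[i] += lam * DT * W[i] * vis * (1.0 - math.tanh(th) ** 2)
        dnext_dy = 1.0 + DT * (-K + s * (-2.0 * y * vis))  # d y_{t+1} / d y_t
        lam = 2.0 * y + lam * dnext_dy         # lambda_t
    return grad

def attack(theta, steps=100, lr=0.1):
    """Plain gradient ascent on the deviation objective."""
    for _ in range(steps):
        g = adjoint_gradient(theta)
        theta = [th + lr * gi for th, gi in zip(theta, g)]
    return theta

theta0 = [0.1, 0.1, 0.1]
theta_adv = attack(theta0)
print("deviation before:", deviation(theta0), "after:", deviation(theta_adv))
```

The backward sweep costs about as much as one extra rollout regardless of how many texture parameters there are, which is the property that makes gradient-based search attractive when the texture is high-dimensional.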

Paper Structure (39 sections, 17 equations, 18 figures, 5 tables)

Introduction
Related Work
Background
Neural Rendering
Method
Differentiable Renderer
Adversarial Object Insertion
Gradient computation via implicit differentiation
Gradient-based Adversarial Attack
Experiments
Experimental Details
Evaluation Metrics
Experimental Results
Limitations
Conclusion
...and 24 more sections

Figures (18)

Figure 1: First-person-view (FPV) of our adversarial attack transfer to an RC car with overhead trajectory view on the right. Row 1: Unperturbed policy execution; Row 2: Random search texture attack; Row 3: Our adversarial attack directly transferred to the real deployment scene, without additional optimization; Row 4: Our adversarial attack discovered in the surrogate NeRF simulator.
Figure 2: Our method can be summarized in the four steps shown. (a) In the top left, we obtain posed images from the deployment scene, which can be a simulator or the real world. (b) In the bottom left, we reconstruct a surrogate scene by fitting a NeRF to the posed images, which serves as a differentiable simulator, and observe only a minor perceptual gap. (c) Given the surrogate scene, we can insert objects, which are also represented as NeRFs, and attack their color fields to generate textural attacks. (d) The discovered adversarial objects are introduced back into the deployment scene.
Figure 3: A computation diagram of our algorithm for generating adversarial attacks. The inner driving loop consists of three components: the neural rendering model, the differentiable driving policy, and the differentiable kinematic car model. We inject the adversarial perturbation to the surrogate scene by composing the outputs of one or more neural object renderers (the single object case is shown above for simplicity) with the output of the neural scene renderer. The parameters of the object renderer(s) are optimized to maximize the deviation of the realized trajectory from the reference trajectory, while keeping the parameters of the driving policy and scene renderer frozen.
Figure 4: Base car on the left; random texture in the middle; adversarial texture on the right.
Figure 5: Selected overhead views and snapshots from adversarial deployment trajectories in the real world (top row: monitor displays adversarial texture discovered in NeRF), and in CARLA (bottom row: adversarial objects inserted in the simulator).
...and 13 more figures
