NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving

William Ljungbergh; Adam Tonderski; Joakim Johnander; Holger Caesar; Kalle Åström; Michael Felsberg; Christoffer Petersson

TL;DR

NeuroNCAP presents a NeRF-based photorealistic simulator for closed-loop safety testing of autonomous driving, learned from real-world sensor sequences and configurable to generate Euro NCAP-inspired safety scenarios. The framework combines a neural renderer, end-to-end AD models, a controller, and a vehicle dynamics model into a four-step closed loop that renders sensor data, predicts a trajectory, applies controls, and propagates the ego state. Evaluation across stationary, frontal, and side collision scenarios reveals that state-of-the-art end-to-end planners often fail in safety-critical, closed-loop settings, even when perception appears robust, underscoring a gap between perception/prediction and planning. By releasing the simulator and a suite of safety-critical scenarios, NeuroNCAP provides a practical benchmark to stress-test and refine AD models, highlighting the need for safer, more robust end-to-end approaches and improved alignment between modules. The work also analyzes the real-to-sim transfer gap and discusses limitations, pointing to future work on expanding scenario diversity, physical realism, and neural rendering capabilities.
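
The four-step loop described above (render, plan, control, propagate) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `render` and `plan` functions are trivial stand-ins for the neural renderer and the AD model, the controller is a simple proportional rule, and the vehicle model is a kinematic bicycle model with made-up parameters. It is not the released NeuroNCAP code.

```python
import math
from dataclasses import dataclass


@dataclass
class EgoState:
    x: float = 0.0    # position [m]
    y: float = 0.0    # position [m]
    yaw: float = 0.0  # heading [rad]
    v: float = 0.0    # speed [m/s]


def render(state):
    """Step 1 (hypothetical stub): a real renderer would return camera images."""
    return {"ego": state}


def plan(observation, horizon=6, dt=0.5, target_speed=5.0):
    """Step 2 (hypothetical stub): straight-line waypoints ahead of the ego."""
    s = observation["ego"]
    return [
        (s.x + target_speed * dt * k * math.cos(s.yaw),
         s.y + target_speed * dt * k * math.sin(s.yaw))
        for k in range(1, horizon + 1)
    ]


def control(state, trajectory, dt=0.5, max_accel=3.0, max_steer=0.5):
    """Step 3 (assumed P-controller): track the first waypoint of the plan."""
    wx, wy = trajectory[0]
    desired_speed = math.hypot(wx - state.x, wy - state.y) / dt
    accel = max(-max_accel, min(max_accel, (desired_speed - state.v) / dt))
    err = math.atan2(wy - state.y, wx - state.x) - state.yaw
    err = math.atan2(math.sin(err), math.cos(err))  # wrap to [-pi, pi]
    steer = max(-max_steer, min(max_steer, err))
    return accel, steer


def step_vehicle(state, accel, steer, dt=0.1, wheelbase=2.8):
    """Step 4 (assumed kinematic bicycle model): one explicit Euler step."""
    return EgoState(
        x=state.x + state.v * math.cos(state.yaw) * dt,
        y=state.y + state.v * math.sin(state.yaw) * dt,
        yaw=state.yaw + state.v / wheelbase * math.tan(steer) * dt,
        v=max(0.0, state.v + accel * dt),
    )


def run_closed_loop(state, n_steps=20):
    """Iterate the four steps; each plan is computed from freshly rendered input."""
    for _ in range(n_steps):
        observation = render(state)
        trajectory = plan(observation)
        accel, steer = control(state, trajectory)
        state = step_vehicle(state, accel, steer)
    return state


final_state = run_closed_loop(EgoState())
```

The point of the structure is that the planner's output changes the next rendered input, which is what distinguishes closed-loop from open-loop evaluation.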

Abstract

We present a versatile NeRF-based simulator for testing autonomous driving (AD) software systems, designed with a focus on sensor-realistic closed-loop evaluation and the creation of safety-critical scenarios. The simulator learns from sequences of real-world driving sensor data and enables reconfigurations and renderings of new, unseen scenarios. In this work, we use our simulator to test the responses of AD models to safety-critical scenarios inspired by the European New Car Assessment Programme (Euro NCAP). Our evaluation reveals that, while state-of-the-art end-to-end planners excel in nominal driving scenarios in an open-loop setting, they exhibit critical flaws when navigating our safety-critical scenarios in a closed-loop setting. This highlights the need for advancements in the safety and real-world usability of end-to-end planners. By publicly releasing our simulator and scenarios as an easy-to-run evaluation suite, we invite the research community to explore, refine, and validate their AD models in controlled, yet highly configurable and challenging sensor-realistic environments. Code and instructions can be found at https://github.com/atonderski/neuro-ncap

Paper Structure (12 sections, 2 equations, 5 figures, 3 tables)

This paper contains 12 sections, 2 equations, 5 figures, 3 tables.

Introduction
Related Work
Method
Closed-loop Simulator
Evaluation
Experiments
Experimental Setting
NeuroNCAP Results
Qualitative Results
Simulation Gap Study
Limitations
Conclusion

Figures (5)

Figure 1: The core idea in NeuroNCAP is to leverage NeRFs to realistically simulate many safety-critical scenarios from a sequence of real-world data. Here we show the original scenario, followed by examples of our three types of collision scenarios: stationary, frontal, and side. The inserted safety-critical actor has been highlighted for illustration purposes. We can generate hundreds of unique scenarios from each log by selecting different actors, jittering their trajectories, and choosing different starting conditions for the ego vehicle. Note that scenarios are not pre-generated, but rather obtained by iteratively generating new images, computing a plan, and acting upon said plan.
Figure 2: Our closed-loop simulation engine comprises four parts. First, given a driving log, a neural renderer (NeRF) provides photo-realistic images given the ego-vehicle state. Second, an AD model (e.g., the end-to-end planner UniAD [hu2023planning]) uses these to predict a future ego-trajectory. Third, a controller estimates acceleration and steering signals. Finally, a vehicle model propagates the ego-vehicle state one step into the future. This process is then iterated to achieve closed-loop simulation. Blue indicates simulator, green indicates AD system.
Figure 3: Different scenario types used in the NeuroNCAP evaluation protocol. The planner is allowed perceptual input $t_{pre}$ seconds before the test starts in order to build temporal context. Once the test starts, at $t=t_{start}$, there are multiple actions that can lead to a successfully completed scenario, e.g., harsh braking or a steering maneuver. To increase the robustness of the test and allow for multiple runs, we introduce small random perturbations to the target actor. Note that this is an illustration and not the actual output of our renderer.
Figure 4: Qualitative examples of three NeuroNCAP scenarios, with projected planning output (green, before controller) and the actual designed future trajectory of the target actor (blue). In some cases the planner reacts successfully (a); in others it does not react at all (b), or attempts to avoid the collision but fails (c). Our simulator can accurately render complex actors (a), but sometimes exhibits unrealistic artifacts for very close objects (b) and (c).
Figure 5: UniAD perception and planning output for three different scenarios, with (right) and without (left) trajectory post-processing. The examples highlight unsafe planning despite strong perception, as well as strengths and weaknesses of post-processing. The plot features ground truth objects (grey) and predicted objects (class-dependent color), and their predicted future trajectories. Moreover, we show the ego-vehicle (black), its planned trajectory (black) and the reference trajectory it is steering towards (red).
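
The captions of Figures 1 and 3 describe obtaining many scenario variants from one log by choosing a target actor, perturbing it slightly, and varying the ego vehicle's starting conditions. A minimal sketch of such sampling follows. The `Scenario` fields, the perturbation ranges, and the function `sample_scenario` are hypothetical names and values chosen for illustration; the source does not specify them.

```python
import random
from dataclasses import dataclass

SCENARIO_KINDS = ("stationary", "frontal", "side")


@dataclass(frozen=True)
class Scenario:
    kind: str               # one of SCENARIO_KINDS
    actor_id: str           # logged actor reused as the safety-critical target
    lateral_offset: float   # jitter applied to the target actor [m]
    yaw_offset: float       # jitter applied to the target actor [rad]
    ego_start_speed: float  # starting condition of the ego vehicle [m/s]


def sample_scenario(kind, actor_ids, rng,
                    max_lateral=0.5, max_yaw=0.1, speed_range=(3.0, 8.0)):
    """Draw one scenario variant; all ranges are assumed, not from the paper."""
    if kind not in SCENARIO_KINDS:
        raise ValueError(f"unknown scenario kind: {kind!r}")
    return Scenario(
        kind=kind,
        actor_id=rng.choice(actor_ids),
        lateral_offset=rng.uniform(-max_lateral, max_lateral),
        yaw_offset=rng.uniform(-max_yaw, max_yaw),
        ego_start_speed=rng.uniform(*speed_range),
    )


# Seeding the generator makes each run of a scenario reproducible.
rng = random.Random(0)
variants = [sample_scenario("side", ["actor_a", "actor_b"], rng) for _ in range(5)]
```

Passing an explicit `random.Random` instance instead of using the module-level functions keeps repeated runs of the same scenario comparable across planners.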
