DriveSceneGen: Generating Diverse and Realistic Driving Scenarios from Scratch

Shuo Sun, Zekai Gu, Tianchen Sun, Jiawei Sun, Chengran Yuan, Yuhang Han, Dongen Li, Marcelo H. Ang

TL;DR

DriveSceneGen tackles the shortage of diverse driving scenarios by learning from real-world data and generating complete scenes from scratch. It combines three steps: a diffusion-based generation stage that creates a raster BEV representation of the initial scene, a graph-based vectorization stage that recovers a coherent lane map, and a simulation stage that uses multi-modal trajectory prediction to produce multiple plausible futures conditioned on the generated scene. The experiments compare 5k generated scenarios with 70k ground-truth scenarios. The authors present it as the first end-to-end method to jointly generate static maps and dynamic agents, supported by a BEV-to-graph vectorization pipeline and a repurposed multi-modal predictor for futures, and they report high fidelity and diversity on the Waymo Motion dataset. This work enables scalable, data-driven generation of driving scenarios for training, validation, and safety testing of autonomous systems, with potential applicability beyond robotics.

Abstract

Realistic and diverse traffic scenarios in large quantities are crucial for the development and validation of autonomous driving systems. However, owing to numerous difficulties in the data collection process and the reliance on intensive annotations, real-world datasets lack sufficient quantity and diversity to support the increasing demand for data. This work introduces DriveSceneGen, a data-driven driving scenario generation method that learns from the real-world driving dataset and generates entire dynamic driving scenarios from scratch. DriveSceneGen is able to generate novel driving scenarios that align with real-world data distributions with high fidelity and diversity. Experimental results on 5k generated scenarios highlight the generation quality, diversity, and scalability compared to real-world datasets. To the best of our knowledge, DriveSceneGen is the first method that generates novel driving scenarios involving both static map elements and dynamic traffic participants from scratch.

Paper Structure (30 sections, 11 equations, 5 figures, 2 tables, 1 algorithm)


Figures (5)

  • Figure 1: An illustration of a set of generated scenarios and their components at the various stages of the DriveSceneGen pipeline.
  • Figure 2: An overview of the DriveSceneGen pipeline. The pipeline consists of two stages: a generation stage and a simulation stage. In the generation stage, a diffusion model generates a rasterized Bird's-Eye-View (BEV) representation of the initial scene of the driving scenario, which is then decoded by a graph-based vectorization method. In the simulation stage, a simulation network takes the vectorized representation as the initial scene and predicts multi-modal joint distributions of the generated agents' future trajectories, with each distribution representing a distinct possible outcome from the same initial scene.
  • Figure 3: A visualization of the results obtained at each stage of the proposed vectorization method. The example feature map used represents a scene of $80\,\mathrm{m} \times 80\,\mathrm{m}$.
  • Figure 4: Qualitative evaluation of the raster initial scenes generated by DriveSceneGen. Compared to the ground truth, the generated samples show high fidelity in terms of road geometry, lane topology, and agents' initial state distribution. The generated results also demonstrate both inter-category and intra-category diversity.
  • Figure 5: Qualitative evaluation of the different possible future scenarios generated from the same initial scene. Comparing the scenarios in each column, each future scenario presents distinct agent behaviors starting from the same initial scene.
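
The Figure 3 caption mentions a feature map that represents a scene of 80 m × 80 m. As a rough illustration of what such a raster BEV representation implies, the sketch below maps metric scene coordinates to pixel indices and back. Only the 80 m extent comes from the caption. The 256-pixel resolution, the scene-centred origin, the axis orientation, and the function names `world_to_pixel` and `pixel_to_world` are assumptions made for this example and are not taken from the paper.

```python
import math

SCENE_SIZE_M = 80.0   # scene side length, from the Figure 3 caption
IMAGE_SIZE_PX = 256   # ASSUMED raster resolution (not stated in this section)


def world_to_pixel(x_m, y_m, scene_size_m=SCENE_SIZE_M, image_size_px=IMAGE_SIZE_PX):
    """Map a point in metres to (row, col) pixel indices.

    Assumed convention: the origin is the scene centre, x points right,
    y points up, and image rows grow downward.
    """
    scale = image_size_px / scene_size_m      # pixels per metre
    half = scene_size_m / 2.0
    col = math.floor((x_m + half) * scale)
    row = math.floor((half - y_m) * scale)
    # Clamp so that points on the far border still land inside the image.
    row = max(0, min(image_size_px - 1, row))
    col = max(0, min(image_size_px - 1, col))
    return row, col


def pixel_to_world(row, col, scene_size_m=SCENE_SIZE_M, image_size_px=IMAGE_SIZE_PX):
    """Inverse mapping: return the metric coordinates of a pixel's centre."""
    metres_per_px = scene_size_m / image_size_px
    half = scene_size_m / 2.0
    x_m = (col + 0.5) * metres_per_px - half
    y_m = half - (row + 0.5) * metres_per_px
    return x_m, y_m


if __name__ == "__main__":
    print(world_to_pixel(0.0, 0.0))        # scene centre
    print(world_to_pixel(-40.0, 40.0))     # top-left corner of the scene
    print(pixel_to_world(*world_to_pixel(10.0, -5.0)))
```

Under these assumptions each pixel covers 80 / 256 = 0.3125 m, so a round trip through the raster can shift a point by up to half a pixel per axis. Any vectorization step that recovers lane geometry from a raster image has to tolerate quantization of this kind.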