SynAD: Enhancing Real-World End-to-End Autonomous Driving Models through Synthetic Data Integration

Jongsuk Kim, Jaeyoung Lee, Gyojin Han, Dongjae Lee, Minki Jeong, Junmo Kim

TL;DR

SynAD introduces ego-centric synthetic data integration for real-world end-to-end (E2E) autonomous driving (AD). It couples ego-centric scenario generation via guided diffusion with a Map-to-BEV network that encodes bird's-eye-view (BEV) features from maps without sensor inputs, and it trains the E2E AD model by selectively merging synthetic map-derived BEV features with real BEV features across motion forecasting and planning. The approach yields safer driving behavior with lower collision rates while maintaining competitive motion prediction and occupancy understanding, validated on nuScenes with ablations confirming each component’s contribution. By bridging synthetic scenario generation and real-world E2E AD pipelines, SynAD enables more diverse, robust training without relying on additional sensor data during inference, broadening practical deployment potential.

Abstract

Recent advancements in deep learning and the availability of high-quality real-world driving datasets have propelled end-to-end autonomous driving. Despite this progress, relying solely on real-world data limits the variety of driving scenarios for training. Synthetic scenario generation has emerged as a promising solution to enrich the diversity of training data; however, its application within E2E AD models remains largely unexplored. This is primarily due to the absence of a designated ego vehicle and the associated sensor inputs, such as camera or LiDAR, typically provided in real-world scenarios. To address this gap, we introduce SynAD, the first framework designed to enhance real-world E2E AD models using synthetic data. Our method designates the agent with the most comprehensive driving information as the ego vehicle in a multi-agent synthetic scenario. We further project path-level scenarios onto maps and employ a newly developed Map-to-BEV Network to derive bird's-eye-view features without relying on sensor inputs. Finally, we devise a training strategy that effectively integrates these map-based synthetic data with real driving data. Experimental results demonstrate that SynAD effectively integrates all components and notably enhances safety performance. By bridging synthetic scenario generation and E2E AD, SynAD paves the way for more comprehensive and robust autonomous driving models.

Paper Structure

This paper contains 49 sections, 32 equations, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Conceptual illustration of SynAD. During training, both real and synthetic data are used to generate BEV and MapBEV features for the E2E AD model, while only real data is used during testing to ensure practical applicability.
  • Figure 2: Overview of SynAD. We generate synthetic multi-agent scenarios and convert them into ego-centric map representations $x_\text{SM}$, while real scenarios are similarly projected as $x_\text{RM}$. To train the Map-to-BEV Network, we use paired data from $x_\text{RM}$ and $x_I$, ensuring that the network produces BEV features consistent with the output of a pretrained BEVFormer applied to multi-camera images. The synthetic scenario $x_\text{SM}$ can then be converted into the BEV feature $B_\text{SM}$ without any multi-camera images using our novel Map-to-BEV Network. In the final E2E AD framework, we selectively apply BEV features only to the modules that benefit most, thereby improving overall performance.
  • Figure 3: Examples of $x_\text{SM}$ over time. The white box indicates the ego vehicle, while orange boxes denote other vehicles. The synthetic scenarios are conditioned on the existing map representation, then projected using vehicle states and size information.
  • Figure 4: Overview of the Map-to-BEV training. We freeze the pretrained BEVFormer and align $B_\text{RM}$ with $B_I$, enabling the network to generate BEV representations without sensor inputs.
  • Figure 5: Qualitative result of SynAD. The performance of SynAD in an urban driving scenario is presented through six views capturing the surroundings. Motion forecasts for the vehicles in front and behind are visualized with color-coded trajectories, where warmer colors (red) indicate more immediate movements and cooler colors (blue) represent later positions.
  • ...and 1 more figure
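
The alignment idea in the Figure 4 caption (a frozen BEVFormer supplies $B_I$, and the Map-to-BEV Network is trained so that $B_\text{RM}$ matches it) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the captions do not state the loss, so mean squared error is used as a stand-in, the single linear layer is a hypothetical placeholder for the real network, and the names `student_forward` and `train_alignment` are invented here. The teacher features are treated as constants, which mirrors the frozen BEVFormer.

```python
import random


def mse(pred, target):
    """Mean squared error between two equal-length feature vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def student_forward(weights, x_rm):
    """Hypothetical stand-in for the Map-to-BEV network: one linear layer."""
    return [sum(w * x for w, x in zip(row, x_rm)) for row in weights]


def train_alignment(pairs, in_dim, out_dim, lr=0.1, epochs=200, seed=0):
    """Fit the student so that student(x_RM) approximates the frozen target B_I.

    pairs: list of (x_rm, b_i) tuples. Each b_i is a fixed target, so no
    update ever reaches the teacher (assumed analogue of freezing BEVFormer).
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    for _ in range(epochs):
        for x_rm, b_i in pairs:
            b_rm = student_forward(weights, x_rm)
            for i in range(out_dim):
                # Gradient of the MSE w.r.t. weights[i][j] is
                # 2 * (b_rm[i] - b_i[i]) * x_rm[j] / out_dim.
                g = 2.0 * (b_rm[i] - b_i[i]) / out_dim
                for j in range(in_dim):
                    weights[i][j] -= lr * g * x_rm[j]
    return weights


# Toy data: a fixed linear "teacher" plays the role of the frozen BEVFormer.
teacher = [[0.5, -1.0], [2.0, 0.25]]
map_inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pairs = [(x, student_forward(teacher, x)) for x in map_inputs]
student = train_alignment(pairs, in_dim=2, out_dim=2)
```

Once trained on real pairs, the same student could be applied to a synthetic map input $x_\text{SM}$ to obtain $B_\text{SM}$ with no camera images, which is the role the captions describe for the Map-to-BEV Network.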