RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios

Wenhao Ding, Yulong Cao, Ding Zhao, Chaowei Xiao, Marco Pavone

TL;DR

RealGen addresses the need for controllable, data-efficient traffic scenario generation for autonomous vehicle evaluation by adopting a retrieval-augmented generation framework. It learns a latent scenario representation through a contrastive autoencoder and trains a combiner to fuse retrieved behaviors into new scenarios, enabling editing and composition of behaviors with user-specified maps and initial poses. The approach demonstrates competitive realism and strong controllability, including tag-driven and safety-critical crash scenario generation, and improves downstream trajectory prediction when its generated scenarios are used for data augmentation. By enabling external updates through retrieved exemplars and avoiding gradient-based search, RealGen offers a scalable path for customizable AV simulation.

Abstract

Simulation plays a crucial role in the development of autonomous vehicles (AVs) due to the potential risks associated with real-world testing. Although significant progress has been made in the visual aspects of simulators, generating complex behavior among agents remains a formidable challenge. It is not only imperative to ensure realism in the scenarios generated but also essential to incorporate preferences and conditions to facilitate controllable generation for AV training and evaluation. Traditional methods, which mainly rely on memorizing the distribution of training datasets, often fall short in generating unseen scenarios. Inspired by the success of retrieval augmented generation in large language models, we present RealGen, a novel retrieval-based in-context learning framework for traffic scenario generation. RealGen synthesizes new scenarios in a gradient-free way by combining behaviors from multiple retrieved examples, which may originate from templates or tagged scenarios. This in-context learning framework provides versatile generative capabilities, including the ability to edit scenarios, compose various behaviors, and produce critical scenarios. Evaluations show that RealGen offers considerable flexibility and controllability, marking a new direction in the field of controllable traffic scenario generation. Check our project website for more information: https://realgen.github.io.
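
The retrieval step can be pictured as a nearest-neighbor lookup over scenario embeddings; the Figure 2 caption names K-Nearest Neighbors (KNN) as the retriever. The sketch below is illustrative only and is not the authors' implementation. The function names, the Euclidean metric, and the toy 2-D embeddings are all assumptions, since the source does not state the metric or the embedding dimension.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_knn(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k database entries closest to the query, nearest first."""
    scored = ((euclidean(query, emb), name) for name, emb in database.items())
    nearest = heapq.nsmallest(k, scored)
    return [(name, dist) for dist, name in nearest]


# Hypothetical behavior embeddings (assumed 2-D purely for readability).
database = {
    "left_turn": [0.9, 0.1],
    "lane_change": [0.5, 0.5],
    "u_turn": [1.0, 0.0],
    "straight": [0.0, 1.0],
}

# Hypothetical embedding of a template scenario used as the query.
template = [0.8, 0.2]
neighbors = retrieve_knn(template, database, k=2)
print(neighbors)
```

Because the lookup only reads from `database`, new scenarios can be added after training without retraining anything, which matches the abstract's point about gradient-free use of external data. In a full pipeline the retrieved behaviors would then be passed to the learned combiner, which this sketch does not attempt to model.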

Paper Structure

This paper contains 18 sections, 4 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: (a) Conventional methods make the model memorize the data distribution for generation. (b) In contrast, our method employs a retriever to query datasets (including external data obtained after training) and uses a generative model to generate scenarios by integrating the information from the retrieved scenarios.
  • Figure 2: Left: the training pipeline for the encoders and decoder aimed at learning latent embeddings of scenarios. A contrastive loss is applied to behavior embeddings to ensure invariance to absolute positions. Middle: the training pipeline for the combiner with frozen encoder and decoder parameters. We use K-Nearest Neighbors (KNN) to retrieve scenarios similar to a template scenario in the dataset and use the retrieved behaviors to reconstruct the template scenario. Right: the generation pipeline with a retriever and a generator.
  • Figure 3: Qualitative evaluation of similar and dissimilar scenarios calculated by our scenario embedding. Rectangles represent the initial poses of vehicles and the lines represent the future trajectories.
  • Figure 4: (a) Scene ID accuracy using the behavior embedding with different distance metrics. (b) A matrix showing the Wasserstein distance between scenario segments, where each block contains the segments that belong to the same Scene ID.
  • Figure 5: Examples of tag-retrieved scenarios generated by RealGen for six different tags.
  • ...and 1 more figure
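
The Figure 4(b) caption refers to a matrix of Wasserstein distances between scenario segments. As a rough illustration of what such a matrix contains, the sketch below computes the 1-Wasserstein distance between equal-size one-dimensional samples, where it reduces to the mean absolute difference of the sorted values. This is a simplification under stated assumptions: the source does not say how segments are represented or how the distance is computed, and the function names and toy segment values here are hypothetical.

```python
from typing import List, Sequence


def wasserstein_1d(u: Sequence[float], v: Sequence[float]) -> float:
    """1-Wasserstein distance between two equal-size empirical 1-D samples.

    For equal-size, equally weighted samples, the optimal transport plan
    matches sorted values in order, so the distance is the mean absolute
    difference between the sorted samples.
    """
    if len(u) != len(v) or len(u) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


def pairwise_matrix(segments: Sequence[Sequence[float]]) -> List[List[float]]:
    """Symmetric matrix of Wasserstein distances between all segment pairs."""
    return [[wasserstein_1d(a, b) for b in segments] for a in segments]


# Hypothetical 1-D features per segment (values chosen only for illustration).
segments = [
    [0.0, 1.0, 3.0],   # segment A
    [3.0, 0.0, 1.0],   # segment B: same values as A in a different order
    [5.0, 6.0, 8.0],   # segment C: shifted by 5 relative to A
]

matrix = pairwise_matrix(segments)
for row in matrix:
    print(row)
```

Segments A and B contain the same values, so their distance is zero, while C sits at a distance of 5 from both. A block structure of this kind, with small distances inside a group and larger ones across groups, is what the caption describes for segments sharing a Scene ID.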