Promptable Closed-loop Traffic Simulation

Shuhan Tan, Boris Ivanovic, Yuxiao Chen, Boyi Li, Xinshuo Weng, Yulong Cao, Philipp Krähenbühl, Marco Pavone

TL;DR

ProSim is a promptable closed-loop traffic simulation framework for autonomous driving. It combines a scene Encoder, a multimodal Generator (with LLM support), and a per-agent Policy to produce interactive rollouts conditioned on per-agent prompts. It shows high realism and controllability across numerical, categorical, and textual prompts, and it comes with ProSim-Instruct-520k, a large dataset pairing real-world driving scenes with prompts and labels. Without prompts, ProSim reaches competitive performance on the Waymo Sim Agents Challenge, and rollouts run at 50 ms per scenario on modern GPUs. The released resources aim to accelerate research in promptable traffic simulation and the development of safer AV systems.

Abstract

Simulation stands as a cornerstone for safe and efficient autonomous driving development. At its core, a simulation system ought to produce realistic, reactive, and controllable traffic patterns. In this paper, we propose ProSim, a multimodal promptable closed-loop traffic simulation framework. ProSim allows the user to give a complex set of numerical, categorical, or textual prompts to instruct each agent's behavior and intention. ProSim then rolls out a traffic scenario in a closed-loop manner, modeling each agent's interaction with other traffic participants. Our experiments show that ProSim achieves high prompt controllability given different user prompts, while reaching competitive performance on the Waymo Sim Agents Challenge when no prompt is given. To support research on promptable traffic simulation, we create ProSim-Instruct-520k, a multimodal prompt-scenario paired driving dataset with over 10M text prompts for over 520k real-world driving scenarios. We will release the code of ProSim as well as the data and labeling tools of ProSim-Instruct-520k at https://ariostgx.github.io/ProSim.
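
To make the phrase "closed-loop" concrete, here is a minimal toy sketch in Python. It is not ProSim's implementation: the learned Encoder, Generator, and Policy networks are replaced by hand-written stand-ins, the world is a single 1-D lane, and the only prompt type is a hypothetical numerical target speed. Every name (`encode_scene`, `generate_policies`, `policy_step`, `rollout`) and every parameter value is an assumption made for illustration. What the sketch does show is the structure: per-agent policy parameters are built once from the scene and the prompts, and at every step each agent then acts on the current states of all other agents.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AgentState:
    position: float  # metres along a single straight lane (toy 1-D world)
    speed: float     # metres per second


@dataclass
class PolicyParams:
    target_speed: float  # speed the agent tries to reach
    min_gap: float       # standstill distance kept to the agent ahead
    headway: float       # extra time gap (seconds) kept to the agent ahead


def encode_scene(speed_limit: float) -> Dict[str, float]:
    """Hypothetical stand-in for a scene encoder: static context, built once."""
    return {"speed_limit": speed_limit}


def generate_policies(scene: Dict[str, float],
                      agents: Dict[str, AgentState],
                      prompts: Dict[str, Optional[float]]) -> Dict[str, PolicyParams]:
    """Hypothetical stand-in for a generator: one policy per agent, built once.

    A prompt here is just a target speed; unprompted agents use the speed limit.
    """
    policies = {}
    for name in agents:
        prompt = prompts.get(name)
        target = scene["speed_limit"] if prompt is None else prompt
        policies[name] = PolicyParams(target_speed=target, min_gap=5.0, headway=1.5)
    return policies


def policy_step(name: str, params: PolicyParams, states: Dict[str, AgentState],
                dt: float, max_accel: float = 2.0) -> AgentState:
    """Hand-written car-following rule that reacts to the current states of all agents."""
    me = states[name]
    gaps = [s.position - me.position
            for other, s in states.items()
            if other != name and s.position > me.position]
    # Speed cap that keeps a gap to the nearest agent ahead (no cap if lane is clear).
    safe_speed = float("inf")
    if gaps:
        safe_speed = max(0.0, (min(gaps) - params.min_gap) / params.headway)
    # Move smoothly toward the target speed, then apply the safety cap.
    max_delta = max_accel * dt
    change = max(-max_delta, min(max_delta, params.target_speed - me.speed))
    new_speed = max(0.0, min(me.speed + change, safe_speed))
    return AgentState(me.position + new_speed * dt, new_speed)


def rollout(agents: Dict[str, AgentState], prompts: Dict[str, Optional[float]],
            speed_limit: float = 15.0, steps: int = 50,
            dt: float = 0.1) -> List[Dict[str, AgentState]]:
    """Closed-loop rollout: every agent re-observes the scene at every step."""
    scene = encode_scene(speed_limit)
    policies = generate_policies(scene, agents, prompts)
    states = dict(agents)
    history = [states]
    for _ in range(steps):
        # All agents act on the same snapshot, then the world advances together.
        states = {n: policy_step(n, policies[n], states, dt) for n in states}
        history.append(states)
    return history


# Usage: the lead car is prompted to slow to 5 m/s; the follower is unprompted.
agents = {"lead": AgentState(30.0, 10.0), "follower": AgentState(0.0, 10.0)}
history = rollout(agents, prompts={"lead": 5.0})
final = history[-1]
print({name: round(state.speed, 2) for name, state in final.items()})
```

In this toy run only the lead car receives a prompt, yet the follower's trajectory changes as well, because it reacts to the lead at every step. An open-loop replay of logged trajectories would not show that reaction.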

Paper Structure (33 sections, 5 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: Overview of ProSim and promptable closed-loop traffic simulation. All agents are controlled by ProSim: green ones are unconditioned and others are prompted with multimodal prompts.
  • Figure 2: Prompt quantity analysis.
  • Figure 3: ProSim can condition on a variety of prompt types. All agents are controlled by ProSim.
  • Figure A1: Language condition encoder for Generator.
  • Figure A2: Interface used by human labeler for Quality Assurance.
  • ...and 1 more figure