SceneStreamer: Continuous Scenario Generation as Next Token Group Prediction

Zhenghao Peng; Yuxin Liu; Bolei Zhou

TL;DR

SceneStreamer is a unified autoregressive framework for continuous scenario generation. It represents the entire scene as a sequence of tokens, including traffic light signals, agent states, and motion vectors, and generates them step by step with a transformer model. This lets it continuously introduce and retire agents over an unbounded horizon, supporting realistic long-duration simulation.

Abstract

Realistic and interactive traffic simulation is essential for training and evaluating autonomous driving systems. However, most existing data-driven simulation methods rely on static initialization or log-replay data, limiting their ability to model dynamic, long-horizon scenarios with evolving agent populations. We propose SceneStreamer, a unified autoregressive framework for continuous scenario generation that represents the entire scene as a sequence of tokens, including traffic light signals, agent states, and motion vectors, and generates them step by step with a transformer model. This design enables SceneStreamer to continuously introduce and retire agents over an unbounded horizon, supporting realistic long-duration simulation. Experiments demonstrate that SceneStreamer produces realistic, diverse, and adaptive traffic behaviors. Furthermore, reinforcement learning policies trained in SceneStreamer-generated scenarios achieve superior robustness and generalization, validating its utility as a high-fidelity simulation environment for autonomous driving. More information is available at https://vail-ucla.github.io/scenestreamer/ .
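
The step-by-step generation described in the abstract can be illustrated with a toy rollout loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: the transformer is replaced by a uniform random sampler (`toy_next_token`), and the group order (traffic lights, then agent states, then motions), the vocabulary sizes, and the spawn and retire probabilities are all hypothetical values chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Step:
    traffic_lights: list  # one token per signal
    agent_states: dict    # agent_id -> state token
    motions: dict         # agent_id -> motion token


def toy_next_token(history, vocab_size, rng):
    """Hypothetical stand-in for a transformer: ignores history, samples uniformly."""
    return rng.randrange(vocab_size)


def rollout(num_steps, num_lights=2, max_agents=4, seed=0):
    rng = random.Random(seed)
    history = []   # flat token sequence a real model would condition on
    active = {}    # agent_id -> latest state token
    next_id = 0
    steps = []
    for _ in range(num_steps):
        # Group 1: traffic light tokens.
        lights = []
        for _ in range(num_lights):
            tok = toy_next_token(history, 3, rng)
            history.append(("light", tok))
            lights.append(tok)

        # Retire some agents and possibly introduce a new one (assumed rates).
        for agent_id in list(active):
            if rng.random() < 0.1:
                del active[agent_id]
        if len(active) < max_agents and rng.random() < 0.5:
            active[next_id] = None
            next_id += 1

        # Group 2: one state token per active agent.
        for agent_id in active:
            tok = toy_next_token(history, 16, rng)
            history.append(("state", agent_id, tok))
            active[agent_id] = tok

        # Group 3: one motion token per active agent.
        motions = {}
        for agent_id in active:
            tok = toy_next_token(history, 8, rng)
            history.append(("motion", agent_id, tok))
            motions[agent_id] = tok

        steps.append(Step(lights, dict(active), motions))
    return steps
```

Because the loop only appends to `history` and keeps a changing set of active agents, nothing in it fixes the number of steps or agents in advance; a real model would additionally need some way to bound the context it conditions on, which this sketch does not address.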
