DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation

Tianyi Yan; Dongming Wu; Wencheng Han; Junpeng Jiang; Xia Zhou; Kun Zhan; Cheng-zhong Xu; Jianbing Shen

TL;DR

DrivingSphere targets the gap between open-loop evaluation on high-fidelity synthetic data and real-world, closed-loop evaluation of autonomous driving. It builds a 4D occupancy-based driving world and renders it as high-fidelity multi-view video. The framework combines OccDreamer for static background generation, an actor bank for dynamic participants, and VideoDreamer, which uses dual-path encoding and ID-aware actor representations to maintain spatial-temporal coherence. An agent coordination loop lets the Ego and Environment Agents interact in a continuous feedback cycle. The authors report superior visual fidelity, temporal consistency, and driving-performance metrics in open- and closed-loop tests on nuScenes. The approach reduces the simulation-to-real-world domain gap and provides a practical platform for validating and improving vision-based autonomous driving systems.

Abstract

Autonomous driving evaluation requires simulation environments that closely replicate actual road conditions, including real-world sensory data and responsive feedback loops. However, many existing simulations only require predicting waypoints along fixed routes on public datasets or synthetic photorealistic data; such open-loop simulation usually cannot assess dynamic decision-making. Recent closed-loop simulators offer feedback-driven environments, but they either cannot process visual sensor inputs or produce outputs that differ from real-world data. To address these challenges, we propose DrivingSphere, a realistic, closed-loop simulation framework. Its core idea is to build a 4D world representation and generate realistic, controllable driving scenarios. Specifically, our framework includes a Dynamic Environment Composition module that constructs a detailed 4D driving world in occupancy format, populated with static backgrounds and dynamic objects, and a Visual Scene Synthesis module that transforms this data into high-fidelity, multi-view video outputs while ensuring spatial and temporal consistency. By providing a dynamic and realistic simulation environment, DrivingSphere enables comprehensive testing and validation of autonomous driving algorithms, ultimately advancing the development of more reliable autonomous cars. The benchmark will be publicly released.
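The difference between open-loop and closed-loop evaluation can be illustrated with a toy one-dimensional example. This is only a sketch, not DrivingSphere's implementation: the `World` state, `ego_policy`, `env_policy`, and all numeric thresholds are hypothetical, and no occupancy world or video rendering is modeled. In the open-loop case the ego agent's outputs are scored against a fixed log and never change the world; in the closed-loop case the actions of the ego agent and a reactive environment agent update the shared state that both observe at the next step.

```python
from dataclasses import dataclass


@dataclass
class World:
    """Hypothetical 1D world state: positions of the ego and a lead vehicle."""
    ego_pos: float
    lead_pos: float


def ego_policy(gap: float) -> float:
    """Hypothetical ego agent: drive at speed 2 unless the gap is 5 or less."""
    return 2.0 if gap > 5.0 else 0.0


def env_policy(gap: float) -> float:
    """Hypothetical environment agent: the lead vehicle speeds up when tailgated."""
    return 2.0 if gap < 6.0 else 1.0


def open_loop(logged_ego, logged_lead):
    """Score one-step predictions against a fixed log.

    The ego's predicted motion is compared with the logged next position,
    but the world always follows the log regardless of what the ego outputs.
    """
    errors = []
    for t in range(len(logged_ego) - 1):
        gap = logged_lead[t] - logged_ego[t]
        predicted = logged_ego[t] + ego_policy(gap)
        errors.append(abs(predicted - logged_ego[t + 1]))
    return errors


def closed_loop(world: World, steps: int):
    """Roll the world forward; each agent's action feeds the next observation."""
    gaps = []
    for _ in range(steps):
        gap = world.lead_pos - world.ego_pos  # shared observation
        ego_v = ego_policy(gap)               # ego agent acts
        lead_v = env_policy(gap)              # environment agent reacts
        world.ego_pos += ego_v                # both actions update the state
        world.lead_pos += lead_v
        gaps.append(world.lead_pos - world.ego_pos)
    return gaps


errors = open_loop([0, 1, 2, 3], [10, 11, 12, 13])
gaps = closed_loop(World(ego_pos=0.0, lead_pos=10.0), steps=10)
print("open-loop errors:", errors)
print("closed-loop gaps:", gaps)
```

In the closed-loop rollout, whether the ego keeps a safe gap depends on how the two agents respond to each other over time, which is the kind of interactive behavior a per-step error against a log cannot capture.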
