ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration

Chaojun Ni, Guosheng Zhao, Xiaofeng Wang, Zheng Zhu, Wenkang Qin, Guan Huang, Chen Liu, Yuyin Chen, Yida Wang, Xueyang Zhang, Yifei Zhan, Kun Zhan, Peng Jia, Xianpeng Lang, Xingang Wang, Wenjun Mei

TL;DR

ReconDreamer tackles the challenge of rendering novel driving trajectories in dynamic scene reconstruction by continuously integrating world-model priors through online restoration and progressive data updates. The method introduces DriveRestorer to mitigate ghosting artifacts and a Progressive Data Update Strategy (PDUS) to steadily incorporate restored novel-trajectory data into training, enabling high-fidelity rendering for large maneuvers up to 6 meters. Empirical results show substantial gains over Street Gaussians and DriveDreamer4D across NTA-IoU, NTL-IoU, and FID, with strong user-study support for improved visual quality and spatiotemporal coherence. Overall, ReconDreamer enables robust closed-loop driving simulations by ensuring accurate viewpoint rendering and coherent scene elements under challenging maneuvers, offering practical impact for autonomous driving evaluation and development.

Abstract

Closed-loop simulation is crucial for end-to-end autonomous driving. Existing sensor simulation methods (e.g., NeRF and 3DGS) reconstruct driving scenes based on conditions that closely mirror training data distributions. However, these methods struggle with rendering novel trajectories, such as lane changes. Recent works have demonstrated that integrating world model knowledge alleviates these issues. Despite their efficiency, these approaches still have difficulty accurately representing more complex maneuvers, with multi-lane shifts being a notable example. Therefore, we introduce ReconDreamer, which enhances driving scene reconstruction through incremental integration of world model knowledge. Specifically, DriveRestorer is proposed to mitigate artifacts via online restoration. This is complemented by a progressive data update strategy designed to ensure high-quality rendering for more complex maneuvers. To the best of our knowledge, ReconDreamer is the first method to render large maneuvers effectively. Experimental results demonstrate that ReconDreamer outperforms Street Gaussians on NTA-IoU, NTL-IoU, and FID, with relative improvements of 24.87%, 6.72%, and 29.97%, respectively. Furthermore, ReconDreamer surpasses DriveDreamer4D with PVG in large maneuver rendering, as verified by a relative improvement of 195.87% in the NTA-IoU metric and a comprehensive user study.
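
The abstract reports relative improvements without stating how they are computed. A minimal sketch of the conventional definition follows, assuming that NTA-IoU and NTL-IoU are higher-is-better scores and that FID is lower-is-better. The function name `relative_improvement` and the numbers in the usage lines are illustrative only and are not taken from the paper.

```python
def relative_improvement(baseline: float, ours: float, higher_is_better: bool = True) -> float:
    """Relative improvement over a baseline, in percent.

    Assumed convention (not stated in the abstract):
      higher-is-better metric: (ours - baseline) / baseline
      lower-is-better metric:  (baseline - ours) / baseline
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    delta = (ours - baseline) if higher_is_better else (baseline - ours)
    return 100.0 * delta / baseline


# Made-up values, chosen only to show the direction of each metric.
iou_gain = relative_improvement(baseline=0.5, ours=0.625)                             # IoU-style score
fid_gain = relative_improvement(baseline=100.0, ours=70.0, higher_is_better=False)    # FID-style score
```

Under this convention, a figure above 100% (such as the 195.87% reported for NTA-IoU) means the score is more than double the baseline.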

Paper Structure

This paper contains 16 sections, 7 equations, 9 figures, 6 tables, 1 algorithm.

Figures (9)

  • Figure 1: Dynamic driving scene reconstruction methods, such as DriveDreamer4D and Street Gaussians, encounter significant challenges when rendering larger maneuvers (e.g., multi-lane shifts). In contrast, the proposed ReconDreamer significantly improves rendering quality by incrementally integrating world model knowledge.
  • Figure 2: The overall framework of ReconDreamer. During the training of the dynamic driving scene reconstruction, we begin by rendering novel trajectory views. These rendered videos are subsequently processed by the DriveRestorer to restore their quality. Then these restored videos, together with the original video, are employed to optimize the reconstruction model. This iterative process continues until the reconstruction model converges. (The training of DriveRestorer is omitted from this figure; see the section on DriveRestorer for details.)
  • Figure 3: The restoration dataset construction for training DriveRestorer. Initially, we train a Gaussian Splatting model on the ground truth (GT) videos from the original trajectory, leaving it under-trained. During training of the Gaussian Splatting model, degraded videos of the original trajectory are rendered at each stage. These degraded videos, paired with their corresponding GT videos, form the restoration dataset. A mask is then applied to the degraded videos to train DriveRestorer, supervised by the GT videos.
  • Figure 4: Restoration data pairs for DriveRestorer training.
  • Figure 5: Examples of degraded video frames rendered under new trajectories and their restored counterparts from DriveRestorer.
  • ...and 4 more figures
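
The loop described in the Figure 2 caption (render novel-trajectory views, restore them, then optimize on restored plus original videos) can be sketched as below. This is a toy illustration under stated assumptions, not the authors' implementation. The `render`, `restore`, and `fit` callables, the lateral-offset schedule `offsets_m`, and the fixed `rounds_per_offset` (standing in for "until the reconstruction model converges") are all hypothetical. Only the overall render, restore, retrain order and the 6-meter upper bound come from the text above.

```python
from typing import Callable, List, Sequence, Tuple

Video = List[float]  # toy stand-in for a rendered clip


def progressive_restoration_loop(
    original_video: Video,
    render: Callable[[float], Video],
    restore: Callable[[Video], Video],
    fit: Callable[[Sequence[Tuple[float, Video]]], None],
    offsets_m: Sequence[float] = (1.5, 3.0, 6.0),  # assumed schedule, not from the paper
    rounds_per_offset: int = 2,                    # assumed stand-in for a convergence check
) -> List[Tuple[float, Video]]:
    """Hypothetical sketch: render -> restore -> retrain, with growing lateral shifts."""
    # The training set starts with only the recorded trajectory (offset 0).
    dataset: List[Tuple[float, Video]] = [(0.0, original_video)]
    for offset in offsets_m:
        for _ in range(rounds_per_offset):
            degraded = render(offset)     # novel-trajectory render, likely with artifacts
            restored = restore(degraded)  # online restoration step
            # Drop any older sample at this offset, then add the fresh restored one.
            dataset = [(o, v) for o, v in dataset if o != offset]
            dataset.append((offset, restored))
            fit(dataset)                  # optimize on original + restored videos
    return dataset
```

In a real pipeline, `render` would query the current reconstruction model, so each round would restore a progressively cleaner render. The toy signature hides that dependency to keep the sketch self-contained.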