ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
Guosheng Zhao, Xiaofeng Wang, Chaojun Ni, Zheng Zhu, Wenkang Qin, Guan Huang, Xingang Wang
TL;DR
ReconDreamer++ tackles the domain gap between generated data and real sensor observations in driving-scene reconstruction by introducing the Novel Trajectory Deformable Network (NTDNet) to learn spatial deformations and by modeling the ground surface with fixed 3D Gaussian priors. This three-component scene representation (ground, non-ground background, dynamic objects), paired with depth-guided optimization and selective deformation for novel trajectories, yields substantial improvements over prior methods, especially on novel viewpoints. Across Waymo, nuScenes, PandaSet, and EUVS, ReconDreamer++ achieves higher fidelity in both foreground and road-surface rendering, with notable gains in NTA-IoU, NTL-IoU, and FID, demonstrating robustness to large trajectory shifts. The work advances closed-loop autonomous driving simulation by improving the geometric consistency of structured elements and reducing cumulative errors between the reconstruction and generative models, enabling more reliable scene synthesis for planning and evaluation.
Abstract
Combining reconstruction models with generative models has emerged as a promising paradigm for closed-loop simulation in autonomous driving. For example, ReconDreamer has demonstrated remarkable success in rendering large-scale maneuvers. However, a significant gap remains between the generated data and real-world sensor observations, particularly in the fidelity of structured elements such as the ground surface. To address these challenges, we propose ReconDreamer++, an enhanced framework that significantly improves the overall rendering quality by mitigating the domain gap and refining the representation of the ground surface. Specifically, ReconDreamer++ introduces the Novel Trajectory Deformable Network (NTDNet), which leverages learnable spatial deformation mechanisms to bridge the domain gap between synthesized novel views and original sensor observations. Moreover, for structured elements such as the ground surface, we preserve geometric prior knowledge in 3D Gaussians, and the optimization process focuses on refining appearance attributes while preserving the underlying geometric structure. Experimental evaluations conducted on multiple datasets (Waymo, nuScenes, PandaSet, and EUVS) confirm the superior performance of ReconDreamer++. On Waymo, ReconDreamer++ achieves performance comparable to Street Gaussians for the original trajectory while significantly outperforming ReconDreamer on novel trajectories. In particular, it achieves substantial improvements, including a 6.1% increase in NTA-IoU, a 23.0% improvement in FID, and a remarkable 4.5% gain in the ground surface metric NTL-IoU, highlighting its effectiveness in accurately reconstructing structured elements such as the road surface.
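The idea of refining appearance while keeping the ground geometry fixed can be illustrated with a toy sketch. This is not the authors' implementation: the `GroundGaussian` class, the `refine_appearance` function, the per-Gaussian L2 color loss, and all numeric values are hypothetical, chosen only to show a geometric prior being held constant while appearance attributes are optimized.

```python
from dataclasses import dataclass


@dataclass
class GroundGaussian:
    # Hypothetical container: position stands in for the geometric prior,
    # color stands in for the learnable appearance attributes.
    position: tuple  # (x, y, z), never updated during optimization
    color: list      # [r, g, b], updated by gradient descent


def refine_appearance(gaussians, targets, lr=0.1, steps=50):
    """Gradient descent on per-Gaussian color under an assumed L2 loss.

    Only `color` is modified; `position` is left untouched, mimicking
    an optimization that preserves the underlying geometry.
    """
    for _ in range(steps):
        for g, target in zip(gaussians, targets):
            for c in range(3):
                grad = 2.0 * (g.color[c] - target[c])  # d/dc of (c - t)^2
                g.color[c] -= lr * grad
    return gaussians


# Made-up example values for illustration only.
ground = [
    GroundGaussian((0.0, 0.0, 0.0), [0.2, 0.2, 0.2]),
    GroundGaussian((1.0, 0.0, 0.0), [0.9, 0.9, 0.9]),
]
targets = [[0.5, 0.5, 0.5], [0.4, 0.4, 0.4]]
positions_before = [g.position for g in ground]
refine_appearance(ground, targets)
```

After the call, the colors have moved toward their targets while every position is identical to its initial value. In a real 3D Gaussian pipeline the analogous choice would be to exclude the geometric parameters of the ground Gaussians from the optimizer, under whatever parameterization the actual system uses.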
