DrivingRecon: Large 4D Gaussian Reconstruction Model For Autonomous Driving
Hao Lu, Tianshuo Xu, Wenzhao Zheng, Yunpeng Zhang, Wei Zhan, Dalong Du, Masayoshi Tomizuka, Kurt Keutzer, Yingcong Chen
TL;DR
DrivingRecon addresses fast, large-scale 4D reconstruction of driving scenes from surround-view videos. It introduces a feed-forward architecture that predicts 4D Gaussians directly, augmented by a Prune and Dilate Block (PD-Block) that adapts the point distribution across overlapping views and around complex edges, and by dynamic/static rendering with cross-temporal supervision. The method improves reconstruction quality and novel view synthesis over state-of-the-art baselines, generalizes across scenes, and is shown to be useful for pre-training, vehicle adaptation, and scene editing. These capabilities support realistic driving scene synthesis and cross-domain transfer for downstream perception, planning, and simulation tasks.
Abstract
Photorealistic 4D reconstruction of street scenes is essential for developing real-world simulators in autonomous driving. However, most existing methods perform this task offline and rely on time-consuming iterative processes, limiting their practical applications. To this end, we introduce the Large 4D Gaussian Reconstruction Model (DrivingRecon), a generalizable driving scene reconstruction model that directly predicts 4D Gaussians from surround-view videos. To better integrate the surround-view images, the Prune and Dilate Block (PD-Block) is proposed to eliminate overlapping Gaussian points between adjacent views and remove redundant background points. To enhance cross-temporal information, dynamic and static decoupling is tailored to better learn geometry and motion features. Experimental results demonstrate that DrivingRecon significantly improves scene reconstruction quality and novel view synthesis compared to existing methods. Furthermore, we explore applications of DrivingRecon in model pre-training, vehicle adaptation, and scene editing. Our code is available at https://github.com/EnVision-Research/DriveRecon.
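
To make the idea of eliminating overlapping Gaussian points between adjacent views concrete, here is a minimal toy sketch. It is not the PD-Block from the paper and is not taken from the released code: it is a purely geometric stand-in that assumes Gaussian centers from all cameras are already expressed in a shared coordinate frame, and it merges centers that fall into the same voxel. The function name `prune_overlapping_points` and the `voxel_size` parameter are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def prune_overlapping_points(points, voxel_size=0.5):
    """Merge points that share a voxel into a single averaged point.

    points: iterable of (x, y, z) Gaussian centers pooled from all views
            (assumed to be in one common coordinate frame).
    voxel_size: edge length of the cubic cells used to detect overlap
                (hypothetical parameter, not from the paper).
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        # Floor division assigns each point to an integer voxel index.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        buckets[key].append((x, y, z))

    pruned = []
    for key in sorted(buckets):
        group = buckets[key]
        n = len(group)
        # Replace every group of overlapping points by its mean position.
        pruned.append(tuple(sum(axis) / n for axis in zip(*group)))
    return pruned


# Two adjacent cameras that both observe roughly the same surface point.
front_view = [(1.0, 2.0, 0.0), (5.0, 5.0, 1.0)]
front_left_view = [(1.1, 2.1, 0.1), (9.0, 0.0, 0.0)]

merged = prune_overlapping_points(front_view + front_left_view)
print(len(front_view + front_left_view), "->", len(merged))
```

In this toy example the two nearly coincident points seen by both cameras collapse into one, while points seen by only one camera are kept unchanged. The abstract describes the PD-Block as also removing redundant background points, which this sketch does not attempt.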
