Neural Rendering based Urban Scene Reconstruction for Autonomous Driving
Shihao Shen, Louis Kerofsky, Varun Ravi Kumar, Senthil Yogamani
TL;DR
This paper tackles dense, accurate reconstruction of urban scenes for autonomous driving by combining neural implicit surfaces with radiance fields to recover geometry and renderings from multimodal sensor data. It introduces a foreground-background decomposition, a dynamic-object filtering strategy based on 3D detections, and a divide-and-conquer training scheme that scales to large environments, all supervised by photometric, Eikonal, and LiDAR-derived geometry losses. The results show that incorporating LiDAR improves reconstruction quality (e.g., PSNR rises from 26.211 to 31.993 and RMSE falls from 9.243 to 5.243 in the reported setup) and that dynamic object filtering reduces artifacts. Overall, the method yields scalable, high-fidelity neural scene representations suited to automated annotation, data augmentation, and offline perception pipelines in autonomous driving.
Abstract
Dense 3D reconstruction has many applications in automated driving, including automated annotation validation, multimodal data augmentation, providing ground truth annotations for systems lacking LiDAR, and enhancing auto-labeling accuracy. LiDAR provides highly accurate but sparse depth, whereas camera images enable dense depth estimation that is noisy, particularly at long range. In this paper, we harness the strengths of both sensors and propose a multimodal 3D scene reconstruction framework that combines neural implicit surfaces and radiance fields. In particular, our method estimates dense and accurate 3D structures and creates an implicit map representation based on signed distance fields, which can be further rendered into RGB images and depth maps. A mesh can be extracted from the learned signed distance field and culled based on occlusion. Dynamic objects are efficiently filtered on the fly during sampling using 3D object detection models. We demonstrate qualitative and quantitative results on challenging automotive scenes.
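
As a rough illustration of the dynamic-object filtering step, the sketch below discards sample points that fall inside detected 3D bounding boxes. It is a minimal sketch under stated assumptions, not the authors' implementation: the box format (center, length, width, height, yaw about the vertical axis) and the function names `outside_all_boxes` and `filter_samples` are hypothetical, and the detections are assumed to be expressed in the same frame as the sample points.

```python
import math


def outside_all_boxes(point, boxes):
    """Return True if `point` lies outside every box.

    point: (x, y, z) sample position.
    boxes: iterable of (cx, cy, cz, length, width, height, yaw) tuples,
           a hypothetical detection format assumed for this sketch.
    """
    x, y, z = point
    for cx, cy, cz, length, width, height, yaw in boxes:
        # Move the point into the box frame: translate, then rotate by -yaw.
        dx, dy, dz = x - cx, y - cy, z - cz
        c, s = math.cos(yaw), math.sin(yaw)
        local_x = c * dx + s * dy
        local_y = -s * dx + c * dy
        inside = (
            abs(local_x) <= length / 2
            and abs(local_y) <= width / 2
            and abs(dz) <= height / 2
        )
        if inside:
            return False
    return True


def filter_samples(points, boxes):
    """Keep only the sample points that no detected box contains."""
    return [p for p in points if outside_all_boxes(p, boxes)]


# Usage: one axis-aligned box of size 4 x 2 x 2 centered at the origin.
boxes = [(0.0, 0.0, 0.0, 4.0, 2.0, 2.0, 0.0)]
samples = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
static_samples = filter_samples(samples, boxes)
```

Testing points against boxes at sampling time, rather than masking pixels in the images beforehand, matches the abstract's description of filtering "on the fly during sampling"; how the boxes are obtained and tracked over time is left open here.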
