HarmonicNeRF: Geometry-Informed Synthetic View Augmentation for 3D Scene Reconstruction in Driving Scenarios

Xiaochao Pan, Jiawei Yao, Hongrui Kou, Tong Wu, Canran Xiao

TL;DR

HarmonicNeRF tackles sparse-view 3D scene reconstruction for autonomous driving by augmenting NeRF with geometry-guided synthetic views. It uses proxy geometry to filter rays and spherical harmonics to model the radiance distribution at surface points, enabling plausible radiance for unseen viewpoints. The method is plug-and-play with sparse implicit surface reconstruction and demonstrates state-of-the-art performance on KITTI, Argoverse, and NuScenes for novel depth view synthesis and scene reconstruction. This improves reconstruction fidelity under real-world outdoor data constraints and could enhance perception pipelines in autonomous systems.

Abstract

In the realm of autonomous driving, achieving precise 3D reconstruction of the driving environment is critical for ensuring safety and effective navigation. Neural Radiance Fields (NeRF) have shown promise in creating highly detailed and accurate models of complex environments. However, the application of NeRF in autonomous driving scenarios encounters several challenges, primarily due to the sparsity of viewpoints inherent in camera trajectories and the constraints on data collection in unbounded outdoor scenes, which typically occurs along predetermined paths. This limitation not only reduces the available scene information but also poses significant challenges for NeRF training, as the sparse and path-distributed observational data leads to under-representation of the scene's geometry. In this paper, we introduce HarmonicNeRF, a novel approach for outdoor self-supervised monocular scene reconstruction. HarmonicNeRF capitalizes on the strengths of NeRF and enhances surface reconstruction accuracy by augmenting the input space with geometry-informed synthetic views. This is achieved through the application of spherical harmonics to generate novel radiance values, taking into account the color observations from the limited available real-world views. Additionally, our method incorporates proxy geometry to effectively manage occlusion, generating radiance pseudo-labels that circumvent the limitations of traditional image-warping techniques, which often fail in the sparse data conditions typical of autonomous driving environments. Extensive experiments conducted on the KITTI, Argoverse, and NuScenes datasets demonstrate that our approach establishes new benchmarks in synthesizing novel depth views and reconstructing scenes, significantly outperforming existing methods. Project page: https://github.com/Jiawei-Yao0812/HarmonicNeRF
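
To make the spherical-harmonics idea in the abstract concrete, the following is a minimal sketch of fitting a low-order real SH expansion to the colors observed at one surface point and then querying it from an unseen direction. It is an illustration under stated assumptions, not the authors' implementation: the SH degree (2), the ridge-regularized least-squares solver, and the function names `sh_basis`, `fit_sh`, and `query_sh` are all hypothetical choices made here.

```python
import numpy as np


def sh_basis(dirs):
    """Real spherical harmonics up to degree 2: (N, 3) directions -> (N, 9) basis values."""
    d = np.asarray(dirs, dtype=float)
    d = d / np.linalg.norm(d, axis=-1, keepdims=True)  # normalize to unit length
    x, y, z = d[:, 0], d[:, 1], d[:, 2]
    return np.stack(
        [
            0.28209479177387814 * np.ones_like(x),     # degree 0
            0.4886025119029199 * y,                    # degree 1
            0.4886025119029199 * z,
            0.4886025119029199 * x,
            1.0925484305920792 * x * y,                # degree 2
            1.0925484305920792 * y * z,
            0.31539156525252005 * (3.0 * z**2 - 1.0),
            1.0925484305920792 * x * z,
            0.5462742152960396 * (x**2 - y**2),
        ],
        axis=-1,
    )


def fit_sh(view_dirs, colors, ridge=1e-3):
    """Fit SH coefficients (9, 3) to RGB colors (N, 3) seen from view_dirs (N, 3).

    The ridge term is an assumption of this sketch; it keeps the system
    solvable when only a few real views observe the point.
    """
    B = sh_basis(view_dirs)
    A = B.T @ B + ridge * np.eye(B.shape[1])
    return np.linalg.solve(A, B.T @ np.asarray(colors, dtype=float))


def query_sh(coeffs, new_dirs):
    """Evaluate the fitted expansion at unseen directions to get pseudo-label colors."""
    return np.clip(sh_basis(new_dirs) @ coeffs, 0.0, 1.0)
```

With only a handful of observations per point, the fit is underdetermined and the regularizer decides how the radiance extrapolates to unobserved directions, so the value of `ridge` and the SH degree would need tuning in practice.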

Paper Structure (13 sections, 10 equations, 6 figures, 4 tables, 1 algorithm)

Figures (6)

  • Figure 1: (a) illustrates the differing camera paths in general surrounding scenes and in driving scenarios. The camera trajectories in driving scenes are more linear and path-constrained, reflecting the typical movement patterns of autonomous driving data collection, whereas general scenes offer more varied viewpoints. (b) shows our method's effectiveness in dealing with the challenges of sparse and dynamic driving environments. Our reconstruction in (c) demonstrates significantly clearer and more accurate geometries.
  • Figure 2: Method Overview. We exploit the coarse geometry in radiance field training to guide its augmentation with sparse inputs. (left) For a surface point $\mathbf{v}$, we aggregate the color observations from all inputs to fit a spherical harmonics (SH) expansion, so that the pseudo-labels for all augmented rays passing through $\mathbf{v}$ can be obtained by querying the SH. (right) When generating augmented rays, we check their visibility and exclude those that cannot actually be observed.
  • Figure 3: Visualization of radiance fitting. We project the radiance values of points distributed on the upper hemisphere onto a circle to visualize the fitted radiance distribution. The first row of images depicts the SH fitting results, while the second row shows the outcomes of interpolation. The SH fitting (top row) provides more informative radiance predictions while preserving the smoothness of the distribution.
  • Figure 4: Visual comparison of scene reconstruction on the NuScenes dataset, contrasting HarmonicNeRF with MipNeRF-360 (Barron et al., 2022).
  • Figure 5: Qualitative comparison of 3D scene reconstructions from the KITTI dataset using different neural radiance field methods. The top row presents original images from diverse urban settings with varying levels of detail and complexity. HarmonicNeRF consistently provides the clearest and most accurate depictions, with crisp textures and fine details, effectively handling challenging lighting and occlusions and showing a marked improvement in both the fidelity and photorealism of the reconstructed scenes.
  • ...and 1 more figure
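
The visibility check mentioned in the Figure 2 caption can be sketched as follows. This is only one plausible way to do it, not necessarily what the paper does: it assumes the proxy geometry is available as a signed distance function (SDF) and uses sphere tracing to test whether an augmented ray reaches the surface point $\mathbf{v}$ before hitting anything else. The names `sphere_sdf` and `is_visible`, the toy sphere proxy, and the tolerances `eps` and `tol` are hypothetical.

```python
import numpy as np


def sphere_sdf(p, center, radius):
    """Toy proxy geometry: signed distance from point p to a sphere."""
    return np.linalg.norm(p - center) - radius


def is_visible(origin, target, sdf, eps=1e-3, tol=5e-2, max_steps=128):
    """Return True if `target` (a surface point) is seen from `origin`.

    Sphere-traces along the ray origin -> target. If the ray touches the
    proxy surface, the target counts as visible only when that hit lies
    within `tol` of the target itself; an earlier hit means occlusion.
    """
    direction = target - origin
    dist = np.linalg.norm(direction)
    direction = direction / dist
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:
            # The ray touched the surface: visible only if this is the target.
            return bool(dist - t < tol)
        t += d  # safe step: no surface is closer than d
        if t >= dist:
            return True  # reached the target without touching anything
    return False  # did not converge; conservatively discard the ray


# Usage: keep only the augmented camera origins that actually see v.
proxy = lambda p: sphere_sdf(p, np.zeros(3), 1.0)
v = np.array([0.0, 0.0, 1.0])
candidates = [np.array([0.0, 0.0, 3.0]), np.array([0.0, 0.0, -3.0])]
visible_origins = [o for o in candidates if is_visible(o, v, proxy)]
```

Rays that graze the surface at a shallow angle converge slowly under sphere tracing, so `tol` and `max_steps` trade off false rejections against runtime; returning `False` on non-convergence errs on the side of dropping a pseudo-label rather than supervising with an occluded one.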