PC-NeRF: Parent-Child Neural Radiance Fields Using Sparse LiDAR Frames in Autonomous Driving Environments
Xiuzhong Hu, Guangming Xiong, Zheng Zang, Peng Jia, Yuxuan Han, Junyi Ma
TL;DR
This work tackles large-scale 3D scene reconstruction and novel LiDAR view synthesis from temporally sparse frames in autonomous driving. It introduces PC-NeRF, a hierarchical framework in which parent NeRFs and child NeRFs share a network and use a multi-level scene representation to exploit sparse LiDAR data efficiently. A two-step depth inference process first locates the relevant child NeRFs via axis-aligned bounding box (AABB) intersection tests and then refines depth within the selected region. Training uses losses defined at the scene, segment, and point levels, including $\mathcal{L}_{ij}^{\mathrm{pd}}$, $\mathcal{L}_{ij}^{\mathrm{cf}}$, and $\mathcal{L}_{ij}^{\mathrm{cd}}$ with weights $\lambda_{\mathrm{pd}}$, $\lambda_{\mathrm{cf}}$, $\lambda_{\mathrm{cd}}$ and parameters $\varepsilon$, $\gamma$. Experiments on MaiCity and KITTI show that PC-NeRF achieves high-precision novel LiDAR view synthesis and 3D reconstruction with as little as one epoch of training, and ablations validate the hierarchical partitioning and the two-step depth inference in sparse-data scenarios.
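To make the first step of the two-step depth inference concrete, the following is a minimal sketch of a ray-AABB intersection test using the standard slab method. It is an illustration under stated assumptions, not the authors' implementation: the function names `ray_aabb_interval` and `candidate_children`, the tuple-based box layout, and the nearest-first ordering are all hypothetical choices made for this example.

```python
import math
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (box_min, box_max); hypothetical layout for this sketch


def ray_aabb_interval(origin: Vec3, direction: Vec3,
                      box_min: Vec3, box_max: Vec3) -> Optional[Tuple[float, float]]:
    """Slab test: return (t_near, t_far) along the ray, or None if the box is missed."""
    t_near, t_far = 0.0, math.inf  # only consider points in front of the sensor
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < 1e-12:
            # Ray is parallel to this pair of slabs: it must start between them.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near, t_far


def candidate_children(origin: Vec3, direction: Vec3,
                       child_boxes: List[Box]) -> List[Tuple[int, float, float]]:
    """Return (child_index, t_near, t_far) for every child box the ray hits, nearest first."""
    hits = []
    for idx, (box_min, box_max) in enumerate(child_boxes):
        interval = ray_aabb_interval(origin, direction, box_min, box_max)
        if interval is not None:
            hits.append((idx, interval[0], interval[1]))
    return sorted(hits, key=lambda h: h[1])


if __name__ == "__main__":
    # Three hypothetical child boxes; the ray along +x should hit the first two.
    boxes = [((4.0, -1.0, -1.0), (6.0, 1.0, 1.0)),
             ((1.0, -1.0, -1.0), (2.0, 1.0, 1.0)),
             ((1.0, 5.0, 5.0), (2.0, 6.0, 6.0))]
    print(candidate_children((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), boxes))
```

In a pipeline of this shape, the returned intervals `[t_near, t_far]` would bound the region in which the second step samples points and refines depth. How PC-NeRF chooses among several intersected child NeRFs is not specified in this section, so the nearest-first ordering here is only one plausible convention.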
Abstract
Large-scale 3D scene reconstruction and novel view synthesis are vital for autonomous vehicles, especially when only temporally sparse LiDAR frames are available. However, conventional explicit representations remain a significant bottleneck to representing reconstructed and synthesized scenes at unlimited resolution. Although the recently developed neural radiance fields (NeRF) have shown compelling results with implicit representations, large-scale 3D scene reconstruction and novel view synthesis from sparse LiDAR frames remain unexplored. To bridge this gap, we propose a 3D scene reconstruction and novel view synthesis framework called parent-child neural radiance field (PC-NeRF). Based on its two modules, parent NeRF and child NeRF, the framework implements hierarchical spatial partitioning and a multi-level scene representation comprising scene, segment, and point levels. The multi-level scene representation makes more efficient use of sparse LiDAR point cloud data and enables rapid acquisition of an approximate volumetric scene representation. Extensive experiments show that PC-NeRF achieves high-precision novel LiDAR view synthesis and 3D reconstruction in large-scale scenes. Moreover, PC-NeRF effectively handles sparse LiDAR frames and demonstrates high deployment efficiency with limited training epochs. Our implementation and the pre-trained models are available at https://github.com/biter0088/pc-nerf.
