Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving

Junyi Cao, Zhichao Li, Naiyan Wang, Chao Ma

TL;DR

Lightning NeRF significantly improves the novel view synthesis performance of NeRF while reducing computational overhead, achieving a five-fold increase in training speed and a ten-fold improvement in rendering speed.

Abstract

Recent studies have highlighted the promising application of NeRF in autonomous driving contexts. However, the complexity of outdoor environments, combined with the restricted viewpoints in driving scenarios, complicates the task of precisely reconstructing scene geometry. Such challenges often lead to diminished quality in reconstructions and extended durations for both training and rendering. To tackle these challenges, we present Lightning NeRF. It uses an efficient hybrid scene representation that effectively utilizes the geometry prior from LiDAR in autonomous driving scenarios. Lightning NeRF significantly improves the novel view synthesis performance of NeRF and reduces computational overhead. Through evaluations on real-world datasets, such as KITTI-360, Argoverse2, and our private dataset, we demonstrate that our approach not only exceeds the current state of the art in novel view synthesis quality but also achieves a five-fold increase in training speed and a ten-fold improvement in rendering speed. Code is available at https://github.com/VISION-SJTU/Lightning-NeRF.

Paper Structure

This paper contains 15 sections, 11 equations, 5 figures, and 7 tables.

Figures (5)

  • Figure 1: Training efficiency. The curves show training PSNR over time, measured on sequence 133e2e0b of Argoverse2 (Wilson et al., 2023).
  • Figure 2: Overview of the proposed framework. The red and green boxes represent the foreground and background in our proposed scene representation, respectively. Given point cloud data from LiDAR observations, we first use LiDAR Initialization to initialize the scene geometry (see Sec. \ref{sec:LI}). Then, for each sample point along a ray, we query the volume density $\sigma$ and the color embedding feature $\boldsymbol{f}$ from the voxel grids (see Sec. \ref{sec:HSR}). We adopt separate MLPs for modeling view-dependent colors (with the viewing direction $\boldsymbol{d}$ as an additional input) and view-independent colors. Combining the two components yields the final rendered image (see Sec. \ref{sec:color_decomp}).
  • Figure 3: Visual results in the extrapolation setting with and without the proposed color decomposition (CD). Best viewed in color and zoomed in.
  • Figure 4: Qualitative results on Argoverse2. The first row shows rendered images and the second row shows depth maps. Best viewed in color.
  • Figure 5: Extrapolation results on Argoverse2. Relative to the reference view, the camera is moved 2 meters to the left.
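
The rendering step described in the Figure 2 caption can be illustrated with a minimal single-ray sketch. It uses the standard NeRF alpha-compositing rule and takes the per-sample density and the two color components as plain inputs, standing in for the voxel-grid queries and the two MLPs. The function name `composite_ray`, its parameters, and the choice to sum the view-independent and view-dependent colors before compositing are assumptions made for illustration; they are not taken from the paper or from the released code.

```python
import math


def composite_ray(sigmas, deltas, colors_vi, colors_vd):
    """Alpha-composite the samples along one ray (standard NeRF quadrature).

    sigmas:    volume density of each sample (non-negative floats)
    deltas:    distance covered by each sample along the ray
    colors_vi: per-sample view-independent RGB triples (hypothetical MLP output)
    colors_vd: per-sample view-dependent RGB triples (hypothetical MLP output)

    Returns the composited RGB list and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, delta, c_vi, c_vd in zip(sigmas, deltas, colors_vi, colors_vd):
        # Opacity contributed by this sample.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        weights.append(weight)
        for k in range(3):
            # Assumption: the two color components are combined by summation.
            rgb[k] += weight * (c_vi[k] + c_vd[k])
        # Light remaining for the samples behind this one.
        transmittance *= 1.0 - alpha
    return rgb, weights


# Usage: an empty-space sample followed by a nearly opaque surface sample.
rgb, weights = composite_ray(
    sigmas=[0.0, 50.0],
    deltas=[0.5, 0.5],
    colors_vi=[(1.0, 1.0, 1.0), (0.2, 0.3, 0.4)],
    colors_vd=[(0.0, 0.0, 0.0), (0.1, 0.1, 0.1)],
)
print(rgb, weights)
```

Because the weights depend only on density, the same weights could also be used to composite per-sample depths into a depth map such as those shown in Figure 4; that use is likewise a sketch rather than a description of the authors' implementation.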