INGeo: Accelerating Instant Neural Scene Reconstruction with Noisy Geometry Priors

Chaojian Li, Bichen Wu, Albert Pumarola, Peizhao Zhang, Yingyan Celine Lin, Peter Vajda

TL;DR

INGeo targets instant neural scene reconstruction on edge devices. It converts geometry priors of the target scene into occupancy grids that guide a grid-based NeRF representation built on Instant-NGP, and it introduces three noise-mitigation strategies (density scaling, point-cloud splatting, and updating occupancy grids) to cope with imperfect priors. The approach achieves roughly a twofold training speedup and reaches an average PSNR above $30$ on NeRF-Synthetic with half the training iterations on an embedded GPU, while preserving quality across budgets. The work is a step toward practical, instant reconstruction for on-device AR/VR applications.
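To make the idea of turning a geometry prior into an occupancy grid concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function name `points_to_occupancy`, the cubic scene bound, the grid resolution, and the use of a simple cubic dilation as a stand-in for point-cloud splatting are all assumptions made for illustration.

```python
import numpy as np


def points_to_occupancy(points, resolution=128, bound=1.0, splat_radius=1):
    """Voxelize a point cloud inside [-bound, bound]^3 into a boolean grid.

    Hypothetical helper: `splat_radius` dilates every occupied voxel by that
    many cells along each axis, a crude stand-in for point-cloud splatting.
    """
    points = np.asarray(points, dtype=np.float64).reshape(-1, 3)

    # Keep only points that fall inside the scene bounding cube.
    inside = np.all(np.abs(points) < bound, axis=1)
    points = points[inside]

    # Map coordinates from [-bound, bound) to integer voxel indices.
    idx = np.floor((points + bound) / (2.0 * bound) * resolution).astype(np.int64)
    idx = np.clip(idx, 0, resolution - 1)

    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True

    if splat_radius > 0:
        # Dilate with a (2r+1)^3 cubic neighbourhood so that sparse or
        # slightly misplaced points still cover the nearby surface.
        r = splat_radius
        padded = np.pad(grid, r, mode="constant", constant_values=False)
        dilated = np.zeros_like(grid)
        for dx in range(2 * r + 1):
            for dy in range(2 * r + 1):
                for dz in range(2 * r + 1):
                    dilated |= padded[
                        dx:dx + resolution,
                        dy:dy + resolution,
                        dz:dz + resolution,
                    ]
        grid = dilated

    return grid


# Example: a noisy point cloud sampled near a sphere of radius 0.5.
rng = np.random.default_rng(0)
directions = rng.normal(size=(2000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
cloud = 0.5 * directions + rng.normal(scale=0.01, size=(2000, 3))
occupancy = points_to_occupancy(cloud, resolution=64, splat_radius=1)
print(f"occupied fraction: {occupancy.mean():.4f}")
```

A grid like this marks most of the volume as empty, so a renderer that consults it can skip ray samples in unoccupied cells. The dilation trades a slightly larger occupied region for robustness to noise in the prior.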

Abstract

We present a method that accelerates reconstruction of 3D scenes and objects, aiming to enable instant reconstruction on edge devices such as mobile phones and AR/VR headsets. While recent works have reduced scene reconstruction training to minutes or even seconds on high-end GPUs, there is still a large gap to the goal of instant training on edge devices, which is highly desired in many emerging applications such as immersive AR/VR. To this end, this work aims to further accelerate training by leveraging geometry priors of the target scene. Our method proposes strategies that alleviate the noise in imperfect geometry priors, accelerating training on top of the highly optimized Instant-NGP. On the NeRF Synthetic dataset, our work uses half of the training iterations to reach an average test PSNR of >30.
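For readers unfamiliar with the metric, the PSNR threshold above can be related to pixel error. Assuming pixel values are normalized to $[0, 1]$ (the usual convention in NeRF evaluations, though not stated in this summary), the PSNR of a rendered image against its ground truth is

$$\mathrm{PSNR} = 10 \log_{10} \frac{1}{\mathrm{MSE}} = -10 \log_{10} \mathrm{MSE},$$

so for a single image $\mathrm{PSNR} > 30$ is equivalent to $\mathrm{MSE} < 10^{-3}$, a root-mean-square error below about $0.032$ per pixel value. The reported figure is an average over test views and scenes, so individual images may fall on either side of this threshold.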

Paper Structure (9 sections, 6 figures, 1 table)

This paper contains 9 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: INGeo accelerates neural reconstruction training by $\sim 2\times$ over the current SotA, Instant-NGP (Müller et al., 2022).
  • Figure 2: Visualization of the pretrained occupancy grid in (a) suggests why it accelerates the training process by $\sim 2\times$ in (b): the occupancy grid eliminates spatial redundancy in training.
  • Figure 3: Visualization of the point cloud obtained by COLMAP (Schönberger and Frahm, 2016) and the converted occupancy grid.
  • Figure 4: Comparison of the effectiveness of geometry priors (a) w/o and (b) w/ the proposed density scaling on the Lego scene (Mildenhall et al., 2020), in terms of training speed curves, rendered images at 50 iterations, and density distribution along a ray.
  • Figure 5: Comparison of (a) w/o (i.e., only w/ the proposed density scaling) and (b) w/ the proposed point-cloud splatting on the Lego scene (Mildenhall et al., 2020), in terms of training efficiency over baselines (e.g., training from random initialization) and rendered images.
  • ...and 1 more figures