BAA-NGP: Bundle-Adjusting Accelerated Neural Graphics Primitives

Sainan Liu, Shan Lin, Jingpei Lu, Alexey Supikov, Michael Yip

TL;DR

BAA-NGP addresses the problem of learning implicit neural representations (INRs) from images with unknown camera poses by combining inverted-sphere parameterization, multi-resolution hash encoding, and occupancy-grid based acceleration. The method jointly optimizes camera poses and the radiance field, aided by a novel coarse-to-fine training strategy that gradually leverages hash-encoded features. Empirical results on LLFF and Blender show 10–20$\times$ speedups over BARF with comparable pose accuracy and improved image quality, demonstrating practical applicability to robotics and real-time scene understanding. These contributions enable rapid, robust INR learning in unstructured, real-world data regimes, lowering the barrier to deploying NeRF-like models in time-sensitive robotic perception tasks.

Abstract

Implicit neural representations have become pivotal in robotic perception, enabling robots to comprehend 3D environments from 2D images. Given a set of camera poses and associated images, these models can be trained to synthesize novel, unseen views. To navigate and interact successfully in dynamic settings, robots need to understand their spatial surroundings through unassisted reconstruction of 3D scenes and camera poses from real-time video footage. Existing approaches such as COLMAP and bundle-adjusting neural radiance field methods take hours to days to run because of the high computational demands of feature matching, dense point sampling, and training a multi-layer perceptron with a large number of parameters. To address these challenges, we propose bundle-adjusting accelerated neural graphics primitives (BAA-NGP), a framework that leverages accelerated sampling and hash encoding to expedite automatic pose refinement/estimation and 3D scene reconstruction. Experimental results demonstrate a 10 to 20$\times$ speed improvement over other bundle-adjusting neural radiance field methods without sacrificing the quality of pose estimation. The GitHub repository is available at https://github.com/IntelLabs/baa-ngp.

Paper Structure (19 sections, 8 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: BAA-NGP is a neural implicit representation that captures 3D scenes from 2D images with unknown camera poses. It learns the 3D scene together with the camera poses within minutes of training, whereas previous methods would have taken hours.
  • Figure 2: INRs use posed images from multiple viewpoints to reconstruct the scene. In our problem, we assume that a sequence of images was taken from unknown viewpoints for unbounded scenes (left) and poorly estimated viewpoints for bounded scenes (right). Purple frames are initial camera poses, gray/blue frames are ground truth camera poses, and the red line indicates a translation error.
  • Figure 3: Reparameterization of 3D space via spherical contraction. Each point $p=(x,y,z)$ outside the unit sphere becomes $(x',y',z',1/h)$, a quadruple that converts unbounded distances $h$ to bounded distances $1/h$.
  • Figure 4: Qualitative analysis of BAA-NGP on the Blender synthetic dataset. BAA-NGP synthesizes higher-quality images than BARF, with cleaner backgrounds and finer details, in 10$\times$ less time.
  • Figure 5: Qualitative analysis of BAA-NGP on the LLFF dataset. Our results are on par with BARF's but converge 20$\times$ faster.
  • ...and 1 more figures
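
The spherical contraction described in the Figure 3 caption can be illustrated with a small sketch. The caption does not define $h$ or $(x',y',z')$ explicitly, so the code below assumes the usual inverted-sphere convention: $h$ is the Euclidean distance of $p$ from the origin and $(x',y',z') = p/h$ is the unit direction. The function names are hypothetical and are not taken from the BAA-NGP repository.

```python
import math


def contract_to_inverted_sphere(p):
    """Map a point outside the unit sphere to (x', y', z', 1/h).

    Assumption (not stated in the caption): h = ||p|| and
    (x', y', z') = p / h, so the first three components lie on the
    unit sphere and the fourth lies in (0, 1).
    """
    x, y, z = p
    h = math.sqrt(x * x + y * y + z * z)
    if h <= 1.0:
        raise ValueError("point must lie strictly outside the unit sphere")
    return (x / h, y / h, z / h, 1.0 / h)


def expand_from_inverted_sphere(q):
    """Invert the mapping: recover (x, y, z) from (x', y', z', 1/h)."""
    xp, yp, zp, inv_h = q
    if not 0.0 < inv_h < 1.0:
        raise ValueError("1/h must lie in the open interval (0, 1)")
    h = 1.0 / inv_h
    return (xp * h, yp * h, zp * h)


# A nearby point and a very distant point both map to bounded coordinates.
near = contract_to_inverted_sphere((3.0, 0.0, 4.0))     # h = 5
far = contract_to_inverted_sphere((0.0, 1000.0, 0.0))   # h = 1000
print(near)
print(far)
```

Under these assumptions every component of the quadruple stays within $[-1, 1]$ however far the original point is from the origin, which is what allows an unbounded background to be represented on a bounded domain.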