Rendering Stable Features Improves Sampling-Based Localisation with Neural Radiance Fields

Boxuan Zhang, Lindsay Kleeman, Michael Burke

TL;DR

A systematic empirical comparison of sampling-based localisation using NeRFs shows that, in contrast to conventional feature matching approaches for geometry-based localisation, sampling-based localisation using NeRFs benefits significantly from stable features.

Abstract

Neural radiance fields (NeRFs) are a powerful tool for implicit scene representation, allowing for differentiable rendering and prediction of unseen viewpoints. There has been growing interest in object- and scene-based localisation using NeRFs, with a number of recent works relying on sampling-based or Monte Carlo localisation schemes. Unfortunately, these can be extremely computationally expensive, requiring multiple network forward passes to infer camera or object pose. To alleviate this, a variety of sampling strategies have been applied, many relying on keypoint recognition techniques from classical computer vision. This work conducts a systematic empirical comparison of these approaches and shows that, in contrast to conventional feature matching approaches for geometry-based localisation, sampling-based localisation using NeRFs benefits significantly from stable features. Results show that rendering stable features provides significantly better estimation with a tenfold reduction in the number of forward passes required.
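
To make the cost structure of sampling-based localisation concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a 1-D pose and replaces the NeRF with a hypothetical analytic `render` function; `localise`, `sigma`, the jitter schedule, and the pixel indices are all illustrative assumptions. Each pose sample is scored by comparing rendered and observed intensities at a chosen subset of pixels, so the number of forward passes scales with samples × pixels × iterations. The `pixels` argument is where a sampling strategy (for example, one based on detected features) would plug in.

```python
import math
import random


def render(pose, pixel):
    """Hypothetical stand-in for one NeRF forward pass (not a real NeRF):
    returns the intensity of `pixel` when the scene is viewed from a 1-D `pose`."""
    return math.sin(0.3 * pixel + pose) + 0.5 * math.cos(0.11 * pixel - 2.0 * pose)


def localise(observed, pixels, n_samples=200, n_iters=15, sigma=0.2, seed=0):
    """Toy Monte Carlo pose estimation over a 1-D pose.

    observed: mapping from pixel index to observed intensity.
    pixels:   subset of pixel indices rendered for every pose sample.
    Returns (pose estimate, number of forward passes used).
    """
    rng = random.Random(seed)
    poses = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    passes = 0
    spread = 0.2  # std. dev. of the jitter added after resampling (assumed schedule)
    for _ in range(n_iters):
        weights = []
        for pose in poses:
            # One forward pass per rendered pixel per pose sample.
            err = sum((render(pose, px) - observed[px]) ** 2 for px in pixels)
            passes += len(pixels)
            weights.append(math.exp(-err / (2.0 * sigma ** 2)))
        if sum(weights) == 0.0:
            # All likelihoods underflowed; fall back to uniform resampling.
            weights = [1.0] * n_samples
        # Resample in proportion to likelihood, then jitter to keep diversity.
        poses = rng.choices(poses, weights=weights, k=n_samples)
        poses = [p + rng.gauss(0.0, spread) for p in poses]
        spread *= 0.7
    return sum(poses) / len(poses), passes


if __name__ == "__main__":
    true_pose = 0.4
    observed = {px: render(true_pose, px) for px in range(64)}
    chosen = [0, 9, 18, 27, 36, 45, 54, 63]  # arbitrary illustrative subset
    estimate, passes = localise(observed, chosen)
    print("estimate:", estimate, "forward passes:", passes)
```

In this sketch, halving the number of rendered pixels or pose samples halves the forward-pass count, which is why the choice of which pixels to render matters for the overall cost.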

Paper Structure (13 sections, 4 equations, 3 figures, 3 tables, 1 algorithm)

Figures (3)

  • Figure 1: Left: Dataset samples and feature locations. Right: Likelihoods for pose rotation (x) show sharper cut-offs for corner detection schemes. Perturbation along other axes shows similar behaviour. We hypothesise that this hinders the convergence of sampling-based pose estimation schemes.
  • Figure 2: Final pose estimation errors when varying rendered pixels and pose samples (16 points rendered along NeRF rays).
  • Figure 3: Final pose estimation errors when varying rendered pixels and pose samples (64 points rendered along NeRF rays).