Matching Query Image Against Selected NeRF Feature for Efficient and Scalable Localization

Huaiji Zhou, Bing Wang, Changhao Chen

TL;DR

This work tackles the efficiency and scalability challenges of NeRF-based visual localization in large-scale environments by introducing MatLoc-NeRF, a matching-based framework that operates on selected intermediate NeRF features. A learnable feature selector minimizes the number of features used for 2D-3D matching, while a pose-aware scene partitioning strategy reduces the number of sub-NeRFs needed for a given pose, enabling fast feature generation and robust localization. The system also performs automatic coarse pose estimation, using two-stage clustering and an ArcFace-based place predictor to provide reliable initializations for subsequent refinement. Evaluations on public large-scale datasets demonstrate superior efficiency and accuracy compared with existing NeRF-based localization methods, highlighting the practical viability of implicit neural maps for scalable localization. Overall, MatLoc-NeRF advances real-time-capable localization in city-scale scenes by eliminating reliance on extra descriptors and by reducing computational overhead through learned feature selection and structured scene partitioning.
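For readers unfamiliar with the ArcFace objective mentioned above, its standard additive angular margin loss for a single training sample can be written as follows. Treating each candidate scene location as one class is an assumption made here for illustration; this summary does not state how the paper configures the loss.

```latex
L = -\log \frac{e^{s\cos(\theta_{y} + m)}}{e^{s\cos(\theta_{y} + m)} + \sum_{j \neq y} e^{s\cos\theta_{j}}}
```

Here $\theta_{j}$ is the angle between the normalized image embedding and the normalized weight vector of class $j$, $y$ is the ground-truth class, $s$ is a scale factor, and $m$ is the angular margin. The margin pushes embeddings of the same class closer together in angle, which is presumably why it suits distinguishing visually similar places.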

Abstract

Neural implicit representations such as NeRF have revolutionized 3D scene representation with photo-realistic quality. However, existing methods for visual localization within NeRF representations suffer from inefficiency and scalability issues, particularly in large-scale environments. This work proposes MatLoc-NeRF, a novel matching-based localization framework using selected NeRF features. It addresses efficiency by employing a learnable feature selection mechanism that identifies informative NeRF features for matching with query images. This eliminates the need for all NeRF features or additional descriptors, leading to faster and more accurate pose estimation. To tackle large-scale scenes, MatLoc-NeRF utilizes a pose-aware scene partitioning strategy. It ensures that only the most relevant NeRF sub-block generates key features for a specific pose. Additionally, scene segmentation and a place predictor provide fast coarse initial pose estimation. Evaluations on public large-scale datasets demonstrate that MatLoc-NeRF achieves superior efficiency and accuracy compared to existing NeRF-based localization methods.
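The selection-then-matching idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: in the paper the selector is learned, whereas this sketch takes precomputed `scores` as input, and the mutual-nearest-neighbour rule with cosine similarity is a common matching heuristic swapped in here, not necessarily the authors' matcher. The function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def select_top_k(scores, k):
    """Indices of the k highest-scoring features, best first.
    Stand-in for a learned selector: scores are simply given."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

def mutual_nearest_matches(query_feats, nerf_feats):
    """(query_idx, nerf_idx) pairs that are each other's nearest neighbour."""
    if not query_feats or not nerf_feats:
        return []
    sim = [[cosine(q, f) for f in nerf_feats] for q in query_feats]
    best_for_query = [max(range(len(nerf_feats)), key=lambda j: row[j])
                      for row in sim]
    best_for_nerf = [max(range(len(query_feats)), key=lambda i: sim[i][j])
                     for j in range(len(nerf_feats))]
    return [(i, j) for i, j in enumerate(best_for_query)
            if best_for_nerf[j] == i]

def match_selected(query_feats, nerf_feats, nerf_points, scores, k):
    """Match query features against only the k selected NeRF features and
    return (query_idx, 3D point) correspondences."""
    keep = select_top_k(scores, k)
    kept_feats = [nerf_feats[i] for i in keep]
    pairs = mutual_nearest_matches(query_feats, kept_feats)
    return [(qi, nerf_points[keep[j]]) for qi, j in pairs]

# Toy data: two query features, three NeRF features with 3D points.
query_feats = [[1.0, 0.0], [0.0, 1.0]]
nerf_feats = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
nerf_points = [(0, 0, 1), (1, 0, 1), (2, 2, 2)]
scores = [0.8, 0.9, 0.1]  # the low-scoring third feature is dropped
matches = match_selected(query_feats, nerf_feats, nerf_points, scores, k=2)
```

Each query index corresponds to a pixel location in the query image, so the returned pairs are 2D-3D correspondences. In a full pipeline they would be passed to a PnP solver (for example OpenCV's `cv2.solvePnPRansac`) to recover the camera pose. Matching against only the selected subset is what keeps the similarity matrix small.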

Paper Structure (15 sections, 7 equations, 2 figures, 6 tables)

Figures (2)

  • Figure 1: Our MatLoc-NeRF, a novel matching-based NeRF localization framework, learns to select the useful NeRF features for 2D-3D matching with the query image. Efficient and scalable localization is achieved by solving the PnP problem with these matches.
  • Figure 2: Our coarse pose estimation module uses two-stage K-Means clustering to identify candidate locations within the large-scale scene. A lightweight CNN is trained to predict the scene location corresponding to the query image, providing an initial pose for NeRF-based matching.
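The two-stage K-Means step named in the Figure 2 caption can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes both stages cluster training camera positions (the caption does not say what each stage clusters on), uses plain Lloyd's iterations with deterministic farthest-point seeding, and the names `kmeans` and `two_stage_places` are hypothetical.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-Means with deterministic farthest-point seeding."""
    k = min(k, len(points))
    centers = [tuple(points[0])]
    while len(centers) < k:
        # Seed the next center at the point farthest from existing centers.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(tuple(far))

    def assign(cs):
        return [min(range(k), key=lambda c: math.dist(p, cs[c]))
                for p in points]

    for _ in range(iters):
        labels = assign(centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centers.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return assign(centers), centers

def two_stage_places(positions, k_coarse, k_fine):
    """Stage 1 splits the scene into coarse regions; stage 2 splits each
    region into finer candidate places. Returns one place label per
    position plus the center of every place."""
    coarse_labels, _ = kmeans(positions, k_coarse)
    place_of = [None] * len(positions)
    place_centers = []
    for region in sorted(set(coarse_labels)):
        idx = [i for i, l in enumerate(coarse_labels) if l == region]
        fine_labels, fine_centers = kmeans([positions[i] for i in idx], k_fine)
        offset = len(place_centers)  # make place ids unique across regions
        place_centers.extend(fine_centers)
        for i, l in zip(idx, fine_labels):
            place_of[i] = offset + l
    return place_of, place_centers

# Toy camera positions: two distant regions, each with two tight groups.
positions = [(0, 0), (0, 1), (5, 0), (5, 1),
             (100, 0), (100, 1), (105, 0), (105, 1)]
place_of, place_centers = two_stage_places(positions, k_coarse=2, k_fine=2)
```

Under this reading, the place labels would serve as classification targets for the lightweight CNN, and the center of the predicted place would give the coarse position used to initialize NeRF-based matching.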