Local positional graphs and attentive local features for a data and runtime-efficient hierarchical place recognition pipeline

Fangming Yuan, Stefan Schubert, Peter Protzel, Peer Neubert

TL;DR

This paper proposes a runtime- and data-efficient hierarchical VPR pipeline that uses hyperdimensional computing (HDC) to employ the same local features twice: aggregated into holistic HDC descriptors for fast candidate selection, and directly for candidate reranking.

Abstract

Large-scale applications of Visual Place Recognition (VPR) require computationally efficient approaches. Further, a well-balanced combination of data-based and training-free approaches can decrease the required amount of training data and effort and can reduce the influence of distribution shifts between the training and application phases. This paper proposes a runtime and data-efficient hierarchical VPR pipeline that extends existing approaches and presents novel ideas. There are three main contributions: First, we propose Local Positional Graphs (LPG), a training-free and runtime-efficient approach to encode spatial context information of local image features. LPG can be combined with existing local feature detectors and descriptors and considerably improves the image-matching quality compared to existing techniques in our experiments. Second, we present Attentive Local SPED (ATLAS), an extension of our previous local features approach with an attention module that improves the feature quality while maintaining high data efficiency. The influence of the proposed modifications is evaluated in an extensive ablation study. Third, we present a hierarchical pipeline that exploits hyperdimensional computing to use the same local features as holistic HDC-descriptors for fast candidate selection and for candidate reranking. We combine all contributions in a runtime and data-efficient VPR pipeline that shows benefits over the state-of-the-art method Patch-NetVLAD on a large collection of standard place recognition datasets with 15$\%$ better performance in VPR accuracy, 54$\times$ faster feature comparison speed, and 55$\times$ less descriptor storage occupancy, making our method promising for real-world high-performance large-scale VPR in changing environments. Code will be made available with publication of this paper.
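
The two-stage idea in the abstract (holistic descriptors for fast candidate selection, then local features for reranking) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bundle`, `local_score`, `hierarchical_search`) are hypothetical, the holistic descriptor is formed by plain HDC bundling (element-wise summation) without any positional binding, and the reranking score is a simple count of mutual nearest-neighbour matches standing in for the LPG-based comparison.

```python
import math


def bundle(local_descriptors):
    """Bundle local descriptors into one holistic vector.

    Assumption: bundling is an element-wise sum followed by L2 normalization.
    """
    dim = len(local_descriptors[0])
    total = [0.0] * dim
    for desc in local_descriptors:
        for i, value in enumerate(desc):
            total[i] += value
    norm = math.sqrt(sum(v * v for v in total)) or 1.0
    return [v / norm for v in total]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def local_score(query_locals, db_locals):
    """Placeholder reranking score: number of mutual nearest-neighbour matches."""
    if not query_locals or not db_locals:
        return 0
    q_to_db = [max(range(len(db_locals)), key=lambda j: cosine(q, db_locals[j]))
               for q in query_locals]
    db_to_q = [max(range(len(query_locals)), key=lambda i: cosine(query_locals[i], d))
               for d in db_locals]
    return sum(1 for i, j in enumerate(q_to_db) if db_to_q[j] == i)


def hierarchical_search(query_locals, database, k=2):
    """Stage 1: top-k candidates by holistic similarity. Stage 2: rerank them locally."""
    q_holistic = bundle(query_locals)
    candidates = sorted(
        range(len(database)),
        key=lambda idx: cosine(q_holistic, bundle(database[idx])),
        reverse=True,
    )[:k]
    return sorted(candidates,
                  key=lambda idx: local_score(query_locals, database[idx]),
                  reverse=True)


# Toy data: each image is a list of 3-dimensional local descriptors.
query = [[1, 0, 0], [0, 1, 0]]
database = [
    [[0, 0, 1], [0, 0, 1]],  # unrelated image
    [[1, 0, 0], [0, 1, 0]],  # same place as the query
    [[1, 0, 0], [1, 0, 0]],  # partially similar image
]
ranking = hierarchical_search(query, database, k=2)
```

In a real system the holistic database descriptors would be computed once and stored, so only the cheap first stage touches the whole database, while the more expensive local comparison runs on the K retained candidates.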

Paper Structure

This paper contains 20 sections, 8 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: An overview of the hierarchical VPR approach proposed in this paper: The pipeline first extracts local ATLAS image descriptors aggregated into a holistic image descriptor with hyperdimensional computing (HDC). These descriptors efficiently retrieve the top-K matching candidates before a final re-ranking with the local descriptors and the proposed Local Positional Graphs.
  • Figure 2: An overview of the proposed hierarchical pipeline. The area enclosed by the green dashed line contains the components of the ATLAS local feature pipeline, whereas the area enclosed by the orange dashed line contains the components that extend ATLAS to Hir-ATLAS. Note that the yellow block at the top right is used only for ATLAS training, whereas the bottom part is used exclusively for inference, i.e., the actual VPR application.
  • Figure 3: Visualization of the Local Positional Graph (LPG). The crosses represent the local features in the images. (a) Creating a star-shaped graph for each local feature in the database image: Two graphs $G_0^{db}$ and $G_1^{db}$ are created for two local features. (b) Creation of corresponding graphs in the query image: The local features in the query image that are mutually matched to the root nodes in $G_0^{db}$ and $G_1^{db}$ serve as root nodes in the graphs $G_0^q$ and $G_1^q$. Features in the query image that are mutually matched with the leaf nodes of $G_0^{db}$ and $G_1^{db}$ are leaf nodes in $G_0^q$ and $G_1^q$. Unmatched leaves are discarded for $G_0^{db}$ (bottom left node in $G_0^{db}$). (c) Graph comparison: The node positions in each graph are translated with the position of their root nodes to overlay the corresponding graphs. The displacement vectors $\delta_k$ are determined for all corresponding leaf nodes.
  • Figure 4: LPG graphs with correctly matched root nodes (a) and incorrectly matched root nodes (b) using local ATLAS features between database images (left) and query images (right). Red circles show the root nodes. Green circles show leaf nodes with a radius that corresponds to the pixel size in the attention map. Red lines connect matched leaf nodes.
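
The graph comparison described in the caption of Figure 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `star_graph_displacements` and its arguments are hypothetical, the choice of leaf nodes for each root is left to the caller, and the sketch stops at the displacement vectors $\delta_k$ because the captions do not say how they are turned into a similarity score.

```python
def star_graph_displacements(db_pos, q_pos, matches, root, leaves):
    """Displacement vectors between one database star graph and its query counterpart.

    db_pos, q_pos: lists of (x, y) keypoint positions in the database and query image.
    matches:       dict mapping a database feature index to its mutually matched
                   query feature index.
    root:          database index of the root node.
    leaves:        database indices of the leaf nodes (selection rule is an assumption
                   left to the caller).

    Returns a list of (dx, dy) tuples, one per matched leaf, or None if the root
    itself has no mutual match in the query image.
    """
    if root not in matches:
        return None

    root_db_x, root_db_y = db_pos[root]
    root_q_x, root_q_y = q_pos[matches[root]]

    deltas = []
    for leaf in leaves:
        if leaf not in matches:
            continue  # unmatched leaves are discarded
        leaf_db_x, leaf_db_y = db_pos[leaf]
        leaf_q_x, leaf_q_y = q_pos[matches[leaf]]
        # Translate both graphs so that their root nodes coincide.
        rel_db = (leaf_db_x - root_db_x, leaf_db_y - root_db_y)
        rel_q = (leaf_q_x - root_q_x, leaf_q_y - root_q_y)
        # Displacement of the query leaf relative to the database leaf.
        deltas.append((rel_q[0] - rel_db[0], rel_q[1] - rel_db[1]))
    return deltas


# Toy example: database feature 3 has no mutual match and is dropped.
db_positions = [(10, 10), (12, 10), (10, 14), (0, 0)]
query_positions = [(20, 20), (22, 20), (21, 24)]
mutual_matches = {0: 0, 1: 1, 2: 2}
deltas = star_graph_displacements(db_positions, query_positions, mutual_matches,
                                  root=0, leaves=[1, 2, 3])
```

Small displacements indicate that the spatial layout around the root is preserved between the two images, which is the kind of evidence a correct match would be expected to provide.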