Re-localization acceleration with Medoid Silhouette Clustering

Hongyi Zhang, Walterio Mayol-Cuevas

TL;DR

The paper tackles the bottleneck of re-localization speed in visual place recognition by introducing a keyframe-driven, tree-like search that leverages Faster Medoid Silhouette Clustering to select informative keyframes. By prioritizing medoid keyframes near cluster centers and refining matches within localized regions, the approach achieves substantial time savings while maintaining accuracy across multiple public datasets. A formal criterion, the Average Medoid Silhouette, guides keyframe quality and reliability of re-localization, enabling adaptive decision-making on when to rely on keyframes. Results on Nordland, Gardens Point Walking, and Oxford Radar RobotCar demonstrate 50–90% reductions in query time with minimal or no loss in localization accuracy, highlighting practical potential for embedded systems and real-time operation.
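
As a rough illustration of the Average Medoid Silhouette criterion, the sketch below assumes the definition commonly used in the medoid-silhouette clustering literature: each sample scores 1 - d1/d2, where d1 and d2 are its distances to the nearest and second-nearest medoid, and the scores are averaged over all samples. The function name, the Euclidean distance on descriptors, and the toy data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def average_medoid_silhouette(descriptors, medoid_indices):
    """Mean of 1 - d1/d2 over all samples (assumed definition, see lead-in)."""
    X = np.asarray(descriptors, dtype=float)
    medoids = X[np.asarray(medoid_indices)]
    if len(medoids) < 2:
        raise ValueError("need at least two medoids")
    # Euclidean distance from every sample to every medoid: (n_samples, n_medoids).
    dists = np.linalg.norm(X[:, None, :] - medoids[None, :, :], axis=2)
    # Sort each row so column 0 is the nearest medoid and column 1 the second nearest.
    dists.sort(axis=1)
    d1, d2 = dists[:, 0], dists[:, 1]
    # Convention: if d2 == 0 (which forces d1 == 0), the ratio is taken as 0.
    ratio = np.divide(d1, d2, out=np.zeros_like(d1), where=d2 > 0)
    return float(np.mean(1.0 - ratio))

# Toy data (hypothetical): two tight groups on a line, medoids at indices 1 and 4.
points = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
ams = average_medoid_silhouette(points, [1, 4])
print(f"average medoid silhouette: {ams:.4f}")
```

Under this definition, values close to 1 indicate that samples sit much nearer to their own medoid than to any other, which is the situation in which keyframe matching would be expected to be reliable.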

Abstract

Two crucial performance criteria for deploying visual localization are speed and accuracy. Current research on visual localization with neural networks focuses mainly on improving network accuracy across various datasets, while how to speed up re-localization within deep neural network architectures still needs further investigation. In this paper, we present a novel approach for accelerating visual re-localization in practice. A tree-like search strategy, built on keyframes extracted by a visual clustering algorithm, is designed to accelerate matching. Our method has been validated on two tasks across three public datasets, saving 50 to 90 percent of the time relative to the baseline without reducing localization accuracy.
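
The tree-like search can be pictured as a two-stage lookup: match the query against the keyframes only, then refine among the frames around the best keyframe. The following is a minimal sketch under stated assumptions: descriptors are plain vectors compared by Euclidean distance, the "neighboring area" is a fixed window of frame indices around the keyframe, and `relocalize`, `brute_force`, `window`, and the toy route are hypothetical names and data chosen for illustration rather than taken from the paper.

```python
import numpy as np

def relocalize(query, reference, keyframe_indices, window=2):
    """Two-stage search: coarse match on keyframes, then a local refinement."""
    ref = np.asarray(reference, dtype=float)
    q = np.asarray(query, dtype=float)
    keys = np.asarray(keyframe_indices)
    # Stage 1: compare the query with the keyframes only.
    key_dists = np.linalg.norm(ref[keys] - q, axis=1)
    best_key = int(keys[np.argmin(key_dists)])
    # Stage 2: compare with the frames in a window around the best keyframe.
    lo = max(0, best_key - window)
    hi = min(len(ref), best_key + window + 1)
    local_dists = np.linalg.norm(ref[lo:hi] - q, axis=1)
    return lo + int(np.argmin(local_dists))

def brute_force(query, reference):
    """Baseline: compare the query with every reference frame."""
    ref = np.asarray(reference, dtype=float)
    q = np.asarray(query, dtype=float)
    return int(np.argmin(np.linalg.norm(ref - q, axis=1)))

# Toy route (hypothetical): 10 frames with 1-D descriptors 0.0 .. 9.0,
# keyframes at frames 2 and 7.
route = np.arange(10, dtype=float).reshape(-1, 1)
match = relocalize(np.array([5.9]), route, [2, 7], window=2)
baseline = brute_force(np.array([5.9]), route)
print(match, baseline)
```

In this toy case the two-stage search makes 7 descriptor comparisons (2 keyframes plus a 5-frame window) instead of 10, and the saving grows with the size of the reference set. The sketch also shows the failure mode: if the stage-1 keyframe is wrong, the refinement cannot recover the correct frame.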

Paper Structure

This paper contains 17 sections, 2 equations, 6 figures, 4 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Illustration of conventional re-localization and structured search via medoid keyframes
  • Figure 2: Visualization of re-localization with keyframes: keyframe extraction, matching the query against the keyframes, and searching the neighboring area for an accurate position
  • Figure 3: Computational time of the im2im and seq2seq tasks
  • Figure 4: Medoid Silhouette and accuracy of tests on three datasets
  • Figure 5: Accuracy of the two tasks with different methods for initializing the keyframes
  • ...and 1 more figure