Re-localization acceleration with Medoid Silhouette Clustering
Hongyi Zhang, Walterio Mayol-Cuevas
TL;DR
The paper tackles the bottleneck of re-localization speed in visual place recognition by introducing a keyframe-driven, tree-like search that uses Faster Medoid Silhouette Clustering to select informative keyframes. By prioritizing medoid keyframes near cluster centers and refining matches within localized regions, the approach achieves substantial time savings while maintaining accuracy across multiple public datasets. A formal criterion, the Average Medoid Silhouette, measures keyframe quality and the reliability of re-localization, enabling adaptive decisions on when to rely on keyframes. Results on Nordland, Gardens Point Walking, and Oxford Radar RobotCar demonstrate 50–90% reductions in query time with minimal or no loss in localization accuracy, highlighting practical potential for embedded systems and real-time operation.
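To make the Average Medoid Silhouette criterion concrete, here is a minimal Python sketch. It assumes the commonly used medoid silhouette definition, in which each point scores 1 - d1/d2, where d1 and d2 are its distances to the nearest and second-nearest medoid; the paper's exact formulation may differ. The function names, the Euclidean distance, and the toy 2-D descriptors are illustrative assumptions, not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_medoid_silhouette(points, medoids, dist=euclidean):
    """Mean of 1 - d1/d2 over all points (assumed definition).

    d1 is the distance to the nearest medoid and d2 the distance to the
    second-nearest medoid. Values close to 1 indicate compact,
    well-separated clusters around the medoids.
    """
    if len(medoids) < 2:
        raise ValueError("at least two medoids are required")
    total = 0.0
    for p in points:
        d1, d2 = sorted(dist(p, m) for m in medoids)[:2]
        # A point equidistant (at zero) from two medoids contributes 0.
        total += 0.0 if d2 == 0 else 1.0 - d1 / d2
    return total / len(points)


# Toy data: two tight groups on a line, one medoid in each group.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
medoids = [(0.0, 0.0), (10.0, 0.0)]
ams = average_medoid_silhouette(points, medoids)
```

Under this assumed definition, a score near 1 suggests the medoid keyframes represent their clusters well, while a low score suggests that matching against keyframes alone may be unreliable.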
Abstract
Two crucial performance criteria for the deployment of visual localization are speed and accuracy. Current research on visual localization with neural networks is limited to examining methods for enhancing the accuracy of networks across various datasets. How to expedite the re-localization process within deep neural network architectures still needs further investigation. In this paper, we present a novel approach for accelerating visual re-localization in practice. A tree-like search strategy, built on the keyframes extracted by a visual clustering algorithm, is designed to accelerate matching. Our method has been validated on two tasks across three public datasets, saving 50 to 90 percent of the time required by the baseline without reducing localization accuracy.
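The tree-like search described above can be illustrated with a two-level sketch: the query is first compared against the keyframes (cluster medoids) only, and the match is then refined within the winning keyframe's cluster. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean distance, the toy 2-D descriptors, and the hand-picked medoid indices are all hypothetical, and in the paper the keyframes come from Faster Medoid Silhouette Clustering.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_match(query, database, dist=euclidean):
    """Baseline: compare the query against every database descriptor."""
    return min(range(len(database)), key=lambda i: dist(query, database[i]))


def build_clusters(database, medoid_ids, dist=euclidean):
    """Assign every database index to its nearest medoid (keyframe)."""
    clusters = {m: [] for m in medoid_ids}
    for i, desc in enumerate(database):
        nearest = min(medoid_ids, key=lambda m: dist(desc, database[m]))
        clusters[nearest].append(i)
    return clusters


def keyframe_match(query, database, clusters, dist=euclidean):
    """Two-level search: pick the best keyframe, then refine in its cluster."""
    best_medoid = min(clusters, key=lambda m: dist(query, database[m]))
    return min(clusters[best_medoid], key=lambda i: dist(query, database[i]))


# Toy database of 2-D descriptors with two hand-picked medoids (indices 1 and 4).
database = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
            (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
clusters = build_clusters(database, medoid_ids=[1, 4])

query = (11.6, 0.0)
fast = keyframe_match(query, database, clusters)
slow = brute_force_match(query, database)
```

In this toy case the two-level search evaluates 2 keyframe distances plus 3 in-cluster distances instead of 6 for the brute-force scan. The saving grows with database size, at the cost of a possible miss when the query's true match lies in a cluster other than the one its nearest keyframe belongs to.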
