PlaceNav: Topological Navigation through Place Recognition
Lauri Suomela, Jussi Kalliola, Harry Edelman, Joni-Kristian Kämäräinen
TL;DR
This work tackles data scarcity and poor computational scaling in learning-based topological navigation by recasting subgoal selection as a visual place recognition task. PlaceNav decouples robot-independent subgoal selection from robot-specific policies, which enables training on large generic datasets and fast retrieval, while a discrete Bayesian filter enforces temporal consistency of the selected subgoals. Experiments show a 76% higher success rate in indoor and a 23% higher success rate in outdoor navigation, together with lower runtime than temporal-distance baselines. The findings suggest that using generic visual data for subgoal selection can improve robustness and scalability in robotic navigation, with future work focused on appearance-invariant recognition and more robust goal-reaching strategies.
Abstract
Recent results suggest that splitting topological navigation into robot-independent and robot-specific components improves navigation performance by enabling the robot-independent part to be trained with data collected by robots of different types. However, the performance of these navigation methods is still limited by the scarcity of suitable training data, and they scale poorly computationally. In this work, we present PlaceNav, which subdivides the robot-independent part into navigation-specific and generic computer vision components. We utilize visual place recognition for the subgoal selection of the topological navigation pipeline. This makes subgoal selection more efficient and enables leveraging large-scale datasets from non-robotics sources, increasing training data availability. Bayesian filtering, enabled by place recognition, further improves navigation performance by increasing the temporal consistency of subgoals. Our experimental results verify the design: the new method obtains a 76% higher success rate in indoor and a 23% higher success rate in outdoor navigation tasks, with higher computational efficiency.
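
To make the combination of place-recognition retrieval and Bayesian filtering concrete, the following is a minimal sketch, not the authors' implementation. It assumes each node of the topological map is stored as a descriptor vector, that a query image is matched by cosine similarity, and that a discrete Bayes filter over the map nodes smooths the result. The function names, the forward-motion probabilities in `step_probs`, and the `temperature` that turns similarities into likelihoods are all hypothetical choices made for illustration, and the one-hot descriptors stand in for the output of a learned place recognition model.

```python
import math


def cosine_similarities(query, map_descriptors):
    """Cosine similarity between one query descriptor and every map-node descriptor."""
    q_norm = math.sqrt(sum(x * x for x in query))
    sims = []
    for d in map_descriptors:
        d_norm = math.sqrt(sum(x * x for x in d))
        dot = sum(a * b for a, b in zip(query, d))
        sims.append(dot / (q_norm * d_norm))
    return sims


def predict(belief, step_probs=(0.5, 0.4, 0.1)):
    """Motion step: assume the robot stays or advances 1-2 nodes along the route.

    step_probs is an illustrative assumption, not a value from the paper.
    """
    n = len(belief)
    predicted = [0.0] * n
    for i, mass in enumerate(belief):
        for step, p in enumerate(step_probs):
            # Clamp at the last node so probability mass is conserved.
            predicted[min(i + step, n - 1)] += p * mass
    return predicted


def update(predicted, similarities, temperature=0.1):
    """Measurement step: weight the predicted belief by a similarity-based likelihood."""
    best = max(similarities)
    likelihood = [math.exp((s - best) / temperature) for s in similarities]
    unnormalized = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def localize(queries, map_descriptors):
    """Run the filter over a query sequence; return the most likely node per step."""
    n = len(map_descriptors)
    belief = [1.0 / n] * n  # start with no knowledge of the robot's position
    estimates = []
    for query in queries:
        sims = cosine_similarities(query, map_descriptors)
        belief = update(predict(belief), sims)
        estimates.append(argmax(belief))
    return estimates


# Toy map: five nodes with one-hot descriptors (stand-ins for learned descriptors).
map_descriptors = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
# The last query looks slightly more like node 4 than node 2 (perceptual aliasing).
queries = [map_descriptors[0], map_descriptors[1], map_descriptors[2],
           [0.0, 0.0, 0.9, 0.0, 1.0]]

raw = [argmax(cosine_similarities(q, map_descriptors)) for q in queries]
filtered = localize(queries, map_descriptors)
print("raw retrieval:", raw)
print("filtered:", filtered)
```

With these toy inputs, raw retrieval jumps from node 2 to node 4 on the aliased last query, giving [0, 1, 2, 4], while the filtered estimate stays at node 2, giving [0, 1, 2, 2]. This is the kind of temporal consistency the abstract attributes to Bayesian filtering: the motion model makes a sudden jump along the route unlikely, so a slightly better match far away does not override the current belief.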
