Appearance-Based Loop Closure Detection for Online Large-Scale and Long-Term Operation
Mathieu Labbé, François Michaud
TL;DR
The paper tackles online loop closure detection for large-scale, long-term appearance-based mapping. It introduces RTAB-Map, a memory-management framework that bounds the size of the Working Memory (WM) to keep processing real-time and moves overflow locations into a Long-Term Memory (LTM). Each new observation is converted into an incremental bag-of-words signature built from SURF features, location weights are updated, and loop-closure hypotheses are tracked with a discrete Bayesian filter. Matching is performed against WM only; locations in LTM that are associated with a matched location are retrieved into WM to support future closures. Key contributions are the memory-management strategy (Weight Update, Retrieval, Transfer), its integration with Bayesian loop-closure hypothesis tracking, and empirical validation on community, university campus, and synthetic data sets showing high recall at 100% precision within real-time constraints. The results indicate that online, large-scale SLAM is feasible when discriminative locations are selectively retained in WM and LTM is used opportunistically through retrieval, giving robust loop closures under illumination and viewpoint changes with practical processing times.
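To make the hypothesis-tracking step concrete, the following is a minimal sketch of one predict/update cycle of a generic discrete Bayes filter. It is not the authors' implementation: the function name `bayes_update`, the convention that index 0 stands for the "new location" hypothesis, and all numeric values in the transition matrix and likelihood vector are assumptions chosen for illustration.

```python
def bayes_update(prior, transition, likelihood):
    """One predict/update step of a generic discrete Bayes filter.

    prior[j]         -- belief in hypothesis j after the previous observation
    transition[i][j] -- assumed probability of moving from hypothesis j to i
    likelihood[i]    -- score of the current observation under hypothesis i
    """
    n = len(prior)
    # Prediction: propagate the previous belief through the transition model.
    predicted = [sum(transition[i][j] * prior[j] for j in range(n))
                 for i in range(n)]
    # Update: weight each predicted belief by the observation likelihood.
    unnormalized = [likelihood[i] * predicted[i] for i in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("all hypotheses have zero probability")
    # Normalize so the posterior sums to one.
    return [u / total for u in unnormalized]


# Illustrative values only (assumptions, not taken from the paper).
# Index 0: "new location"; indices 1 and 2: two locations held in WM.
prior = [0.8, 0.1, 0.1]
transition = [
    [0.8, 0.1, 0.1],
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
]  # each column sums to 1
likelihood = [1.0, 1.0, 6.0]  # the current image resembles WM location 2

posterior = bayes_update(prior, transition, likelihood)
best = max(range(len(posterior)), key=posterior.__getitem__)
```

In this toy run the strong likelihood for location 2 outweighs the prior preference for "new location", so `best` points at location 2. A real system would additionally compare the winning posterior against an acceptance threshold before declaring a loop closure.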
Abstract
In appearance-based localization and mapping, loop closure detection is the process used to determine if the current observation comes from a previously visited location or a new one. As the size of the internal map increases, so does the time required to compare new observations with all stored locations, eventually limiting online processing. This paper presents an online loop closure detection approach for large-scale and long-term operation. The approach is based on a memory management method, which limits the number of locations used for loop closure detection so that the computation time remains under real-time constraints. The idea consists of keeping the most recent and frequently observed locations in a Working Memory (WM) used for loop closure detection, and transferring the others into a Long-Term Memory (LTM). When a match is found between the current location and one stored in WM, associated locations stored in LTM can be updated and remembered for additional loop closure detections. Results demonstrate the approach's adaptability and scalability using ten standard data sets from other appearance-based loop closure approaches, one custom data set using real images taken over a 2 km loop of our university campus, and one custom data set (7 hours) using virtual images from the racing video game "Need for Speed: Most Wanted".
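The WM/LTM idea described above can be sketched as a small data structure. This is a simplified illustration under stated assumptions, not the RTAB-Map implementation: a fixed `wm_capacity` stands in for the real-time limit, the location with the lowest weight (oldest first on ties) is the one transferred, and the names `Location`, `Memory`, `add`, and `on_loop_closure` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    id: int                      # ids increase with time, so a lower id is older
    weight: int = 0              # how often this location has been re-observed
    neighbors: set = field(default_factory=set)  # ids of associated locations


class Memory:
    def __init__(self, wm_capacity):
        self.wm_capacity = wm_capacity  # assumed stand-in for the time limit
        self.wm = {}                    # Working Memory: searched for matches
        self.ltm = {}                   # Long-Term Memory: not searched

    def add(self, loc):
        """Insert a new location into WM, then transfer overflow to LTM."""
        self.wm[loc.id] = loc
        self._transfer(protect={loc.id})

    def _transfer(self, protect):
        """Move lowest-weight, oldest locations to LTM until WM fits."""
        while len(self.wm) > self.wm_capacity:
            candidates = [l for l in self.wm.values() if l.id not in protect]
            if not candidates:
                break
            victim = min(candidates, key=lambda l: (l.weight, l.id))
            self.ltm[victim.id] = self.wm.pop(victim.id)

    def on_loop_closure(self, matched_id):
        """Reinforce the matched location and retrieve its LTM neighbors."""
        matched = self.wm[matched_id]
        matched.weight += 1
        retrieved = [n for n in sorted(matched.neighbors) if n in self.ltm]
        for nid in retrieved:
            self.wm[nid] = self.ltm.pop(nid)
        # Retrieval may overfill WM, so transfer other locations out again.
        self._transfer(protect={matched_id, *retrieved})
        return retrieved


# Usage: four locations along a path, with room for only three in WM.
mem = Memory(wm_capacity=3)
for i in range(1, 5):
    mem.add(Location(id=i, neighbors={i - 1, i + 1}))
# Location 1 has been transferred to LTM. A match with location 2
# brings its neighbor 1 back and pushes location 3 out instead.
retrieved = mem.on_loop_closure(matched_id=2)
```

Keeping the search restricted to `wm` is what bounds the per-observation cost in this sketch, while retrieval on a match keeps the neighborhood of revisited places available for the next detections.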
