Appearance-Based Loop Closure Detection for Online Large-Scale and Long-Term Operation

Mathieu Labbé, François Michaud

TL;DR

The paper tackles online loop closure detection in large-scale, long-term appearance-based mapping by introducing RTAB-Map, a memory-management framework that bounds the Working Memory (WM) to maintain real-time processing and uses a Long-Term Memory (LTM) to store overflow locations. It combines a Bayesian framework with Retrieval and Transfer mechanisms: new observations are converted into incremental bag-of-words signatures from SURF features, weights are updated, and loop-closure hypotheses are tracked with a discrete Bayesian filter; matching is performed against WM, while nearby or retrieved locations from LTM are brought into WM to support future closures. Key contributions include the memory-management strategy (Weight Update, Retrieval, Transfer), the integrated Bayesian loop-closure hypothesis tracking, and extensive empirical validation across community, university campus, and synthetic datasets showing high recall at 100% precision while satisfying real-time constraints. The approach demonstrates that online, large-scale SLAM is feasible by selectively retaining discriminative locations in WM and opportunistically leveraging LTM via retrieval, enabling robust loop closures under varying illumination and viewpoint changes with practical processing times.

Abstract

In appearance-based localization and mapping, loop closure detection is the process used to determine whether the current observation comes from a previously visited location or a new one. As the size of the internal map increases, so does the time required to compare new observations with all stored locations, eventually limiting online processing. This paper presents an online loop closure detection approach for large-scale and long-term operation. The approach is based on a memory management method, which limits the number of locations used for loop closure detection so that the computation time remains under real-time constraints. The idea consists of keeping the most recent and frequently observed locations in a Working Memory (WM) used for loop closure detection, and transferring the others into a Long-Term Memory (LTM). When a match is found between the current location and one stored in WM, associated locations stored in LTM can be updated and remembered for additional loop closure detections. Results demonstrate the approach's adaptability and scalability using ten standard data sets from other appearance-based loop closure approaches, one custom data set using real images taken over a 2 km loop of our university campus, and one custom data set (7 hours) using virtual images from the racing video game "Need for Speed: Most Wanted".
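
The WM/LTM idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Location`, `Memory`, `wm_capacity`, and `on_loop_closure` are hypothetical, and a fixed WM capacity is assumed here as a stand-in for the real-time constraint. Transfer moves the least observed (then oldest) location to LTM, and a loop closure brings the matched location's neighbors back from LTM.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    """A stored location; `weight` counts how often it has been re-observed."""
    loc_id: int
    weight: int = 0
    neighbors: set = field(default_factory=set)


class Memory:
    """Toy WM/LTM store (assumption: WM is bounded by a location count)."""

    def __init__(self, wm_capacity):
        self.wm_capacity = wm_capacity
        self.wm = {}   # locations compared against for loop closure
        self.ltm = {}  # overflow locations, not compared against

    def add(self, loc):
        """Insert a new location in WM, then transfer overflow to LTM."""
        self.wm[loc.loc_id] = loc
        self._transfer(protected={loc.loc_id})

    def _transfer(self, protected):
        """Move lowest-weight, then oldest, unprotected locations to LTM."""
        while len(self.wm) > self.wm_capacity:
            candidates = [l for l in self.wm.values()
                          if l.loc_id not in protected]
            if not candidates:
                break
            victim = min(candidates, key=lambda l: (l.weight, l.loc_id))
            self.ltm[victim.loc_id] = self.wm.pop(victim.loc_id)

    def on_loop_closure(self, matched_id):
        """Reward the matched location and retrieve its LTM neighbors."""
        matched = self.wm[matched_id]
        matched.weight += 1
        retrieved = [n for n in sorted(matched.neighbors) if n in self.ltm]
        for n in retrieved:
            self.wm[n] = self.ltm.pop(n)
        self._transfer(protected={matched_id, *retrieved})
        return retrieved


# Usage: five consecutive locations with a WM that holds three.
mem = Memory(wm_capacity=3)
for i in range(1, 6):
    mem.add(Location(i, neighbors={i - 1, i + 1}))
retrieved = mem.on_loop_closure(3)  # location 3 is matched again
```

In this sketch, matching location 3 retrieves its neighbor 2 from LTM, and an unmatched location is transferred out to keep WM within its bound. The retrieved neighbor then becomes available for subsequent loop closure detections.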

Paper Structure (15 sections, 4 equations, 11 figures, 3 tables, 4 algorithms)

This paper contains 15 sections, 4 equations, 11 figures, 3 tables, 4 algorithms.

Figures (11)

  • Figure 1: Graph representation of locations. Vertical arrows are loop closure links and horizontal arrows are neighbor links. Dotted links show undetected loop closures. Black locations are those in LTM, white ones are in WM and gray ones are in STM. Node 455 is the currently acquired location.
  • Figure 2: RTAB-Map memory management model.
  • Figure 3: Database representation of the LTM.
  • Figure 4: Precision-recall curves for each data set.
  • Figure 5: UdeS data set aerial view. The first traversal is represented by the dotted line. The second traversal is represented by a line located around the first. The start/end point is represented by the circle. The small white dots in the waypoint ID numbers represent camera orientation at this location. Recall performance is from the test case with $T_{\mathrm{time}}=0.7 \, \mathrm{s}$.
  • ...and 6 more figures