Robust Loop Closure by Textual Cues in Challenging Environments

Tongxing Jin, Thien-Minh Nguyen, Xinhang Xu, Yizhuo Yang, Shenghai Yuan, Jianping Li, Lihua Xie

TL;DR

This work proposes a multi-modal loop closure method based on explicit human-readable textual cues in FDR environments that has superior performance over existing methods that rely solely on visual and LiDAR sensors.

Abstract

Loop closure is an important task in robot navigation. However, existing methods mostly rely on implicit or heuristic features of the environment, which can still fail in common environments such as corridors, tunnels, and warehouses. Indeed, navigating in such featureless, degenerative, and repetitive (FDR) environments poses a significant challenge even for humans, but explicit text cues in the surroundings often provide the best assistance. This inspires us to propose a multi-modal loop closure method based on explicit human-readable textual cues in FDR environments. Specifically, our approach first extracts scene text entities using Optical Character Recognition (OCR), then creates a local map of text cues based on accurate LiDAR odometry, and finally identifies loop closure events with a graph-theoretic scheme. Experimental results demonstrate that this approach outperforms existing methods that rely solely on visual and LiDAR sensors. To benefit the community, we release the source code and datasets at https://github.com/TongxingJin/TXTLCD.

Paper Structure

This paper contains 20 sections, 10 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: Examples of common FDR scenes, where humans naturally navigate using readable textual signs and their spatial arrangements. This inspires us to use text cues for global location understanding.
  • Figure 2: Pipeline of our text cue-based loop closure. Camera and LiDAR data are fused to estimate text entity poses and create local text entity maps that encode the specific arrangement of the scene texts. A novel graph-theoretic scheme is applied to verify the authenticity of candidate loop closures retrieved from the online database, and pose graph optimization is performed whenever a new loop is closed to mitigate cumulative odometry drift and ensure the consistency of global maps.
  • Figure 3: Illustrations of Text Entity Representation.
  • Figure 4: Putative associations between two LTEMs. LTEM $\mathcal{M}_c$ and $\mathcal{M}_p$ contain a set of text entities observed by the continuous LiDAR poses $\mathcal{T}_c$ (green trajectory) and $\mathcal{T}_p$ (blue trajectory) respectively. Putative associations are denoted by balls of the same color connected by purple lines, while dashed lines indicate false associations.
  • Figure 5: Consistency graph. The darkness of each line signifies the geometrical consistency between the two connected nodes (putative associations).
  • ...and 3 more figures
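
The captions of Figures 4 and 5 describe putative associations between two LTEMs and a consistency graph over them. The following is a minimal sketch of that idea, not the authors' implementation: it assumes each text entity is reduced to a 3D position, that two associations are consistent when they preserve the pairwise distance within a hypothetical tolerance `tol`, and that the inlier set is the largest mutually consistent group, found here by brute force. The function names, the binary (rather than weighted) edges, and the tolerance value are all illustrative assumptions.

```python
import math
from itertools import combinations


def consistency_graph(points_c, points_p, associations, tol=0.2):
    """Build a boolean adjacency matrix over putative associations.

    points_c / points_p: lists of (x, y, z) text entity positions in the
    current and previous local maps (hypothetical simplification).
    associations: list of (i, j) pairs matching points_c[i] to points_p[j].
    Two associations are linked when the distance between their entities
    is the same in both maps, up to `tol` (an assumed threshold).
    """
    n = len(associations)
    adj = [[False] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        ia, ja = associations[a]
        ib, jb = associations[b]
        d_c = math.dist(points_c[ia], points_c[ib])
        d_p = math.dist(points_p[ja], points_p[jb])
        if abs(d_c - d_p) <= tol:
            adj[a][b] = adj[b][a] = True
    return adj


def largest_consistent_set(adj):
    """Return indices of the largest mutually consistent associations.

    Brute-force maximum clique search; only suitable for the handful of
    associations in a toy example.
    """
    n = len(adj)
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(adj[a][b] for a, b in combinations(subset, 2)):
                return list(subset)
    return []


if __name__ == "__main__":
    # Three entities shifted by a pure translation, plus one false match.
    pts_c = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 0)]
    pts_p = [(10, 0, 0), (11, 0, 0), (10, 2, 0), (10, 9, 0)]
    assoc = [(0, 0), (1, 1), (2, 2), (3, 3)]
    graph = consistency_graph(pts_c, pts_p, assoc)
    print(largest_consistent_set(graph))  # [0, 1, 2]
```

Pairwise distances are used because they are invariant to the unknown rigid transform between the two maps, so false associations can be rejected before any pose is estimated. A practical system would replace the brute-force search with a scalable clique or dense-subgraph solver.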