OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition

Qiuchi Xiang; Jintao Cheng; Jiehao Luo; Jin Wu; Rui Fan; Xieyuanli Chen; Xiaoyu Tang

TL;DR

OverlapMamba tackles robust, real-time LiDAR-based place recognition for SLAM by converting range views (RVs) into directional sequences and applying a shift state space model with random yaw reconstruction. The architecture combines an OverlapMamba backbone, a multi-directional OverlapMamba block, and a NetVLAD-based Global Descriptor Generator to produce yaw-invariant global descriptors from raw RVs. An ImTrihard triplet loss further improves convergence and generalization. Across KITTI, Ford Campus, and NCLT, it achieves state-of-the-art loop closure detection and place recognition with significantly lower runtime than transformer-based approaches, making it suitable for real-time localization in autonomous systems.

Abstract

Place recognition is the foundation for enabling autonomous systems to achieve independent decision-making and safe operations. It is also crucial in tasks such as loop closure detection and global localization within SLAM. Previous methods use conventional point cloud representations as input, while deep learning-based LiDAR place recognition (LPR) approaches feed various point cloud image inputs to convolutional neural networks (CNNs) or transformer architectures. However, the recently proposed Mamba deep learning model, combined with state space models (SSMs), holds great potential for long sequence modeling. We therefore developed OverlapMamba, a novel network for place recognition that represents input range views (RVs) as sequences. We employ a stochastic reconstruction approach to build shift state space models, compressing the visual representation. Evaluated on three public datasets, our method effectively detects loop closures and remains robust even when previously visited locations are traversed from different directions. Relying on raw range view inputs, it outperforms typical LiDAR and multi-view combination methods in time complexity and speed, indicating strong place recognition capabilities and real-time efficiency.
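
The stochastic reconstruction mentioned in the abstract can be illustrated with a small sketch. It assumes, as the Figure 3 caption suggests, that a random parameter `a` picks the starting index of the reconstructed sequence and that the sequence wraps around cyclically, since a range view covers the full 360° of yaw. The function names, the uniform sampling of `a`, and the use of plain Python lists in place of feature tensors are illustrative assumptions, not the authors' implementation.

```python
import random


def random_yaw_reconstruct(seq, rng=random):
    """Restart a yaw-ordered sequence at a random index, wrapping around.

    Assumption: `a` is drawn uniformly from [0, 1) and scaled by the
    sequence length to obtain the starting index.
    """
    a = rng.random()            # random parameter a in [0, 1)
    start = int(a * len(seq))   # starting index of the reconstructed sequence
    return seq[start:] + seq[:start], start


def is_cyclic_shift(x, y):
    """Return True if y is some circular shift of x."""
    if len(x) != len(y):
        return False
    return any(y == x[k:] + x[:k] for k in range(max(len(x), 1)))


# Hypothetical feature sequence: one entry per yaw column of the range view.
columns = ["c0", "c1", "c2", "c3", "c4", "c5"]
shifted, start = random_yaw_reconstruct(columns, random.Random(0))
```

Because the reconstruction only changes where the sequence starts, every element and its circular neighbourhood are preserved; a model trained on such reconstructions sees the same place under many apparent headings.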

Paper Structure (18 sections, 8 equations, 8 figures, 5 tables, 1 algorithm)

Introduction
Related work
LPR Based on Local Description
LPR Based on Global Description
Overview of the Framework
Preliminaries
Mamba-Based Place Recognition
OverlapMamba block
Sequential Pyramid Pooling in the Backbone
Improved Triplet Loss with Hard Mining
Experiments
Experimental Setup
Evaluation for Loop Closure Detection
Evaluation for Place Recognition
Ablation Study on Mamba Modules
...and 3 more sections

Figures (8)

Figure 1: Core idea of the proposed OverlapMamba model. The left parts represent RV projection and 1D point cloud serialization. The right parts represent the overview of our novel state space models for place recognition.
Figure 2: Overview of the proposed OverlapMamba. Assuming a batch size of 1, the overlap backbone compresses the RVs from the LiDAR sensor information into yaw-equivariant feature sequences. The OverlapMamba block connects the feature sequences from the backbone with the multidirectionally enhanced feature sequences processed by the SSM. The global descriptor generator (GDG) utilizes a combination of multilayer perceptron (MLP) and NetVLAD to generate a one-dimensional global descriptor.
Figure 3: The SIFT operation process. The left part shows an example of an RV containing omnidirectional feature information. In the right part of the figure, we demonstrate the process of randomly reconstructing the feature sequence modeled along the vertical direction for the yaw angle, where $a$ is a random parameter used to calculate the starting index of the reconstructed sequence.
Figure 4: The structure of the SPP block. Three consecutive 1D pooling operations are performed in the block, and the intermediate states are added together to obtain the output. A simple example is used to visualize the processing of the sequence, demonstrating that the SPP preserves yaw information while enriching the originally discontinuous spatial information.
Figure 5: Original loss and F1max during training. Because the loss converges rapidly, the model does not learn enough generalizable information. The figure shows that the model overfits before the loss has fully converged.
...and 3 more figures
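
The SPP block described in the Figure 4 caption (three consecutive 1D pooling operations whose intermediate states are added together) can be sketched as follows. The caption does not give the pooling type, kernel size, or padding, so the max pooling, kernel size of 3, stride of 1, and circular padding used here are all assumptions chosen to keep the sequence length unchanged and to make the yaw-preserving behaviour easy to check. The function names are hypothetical.

```python
def max_pool1d_circular(seq, kernel=3):
    """Stride-1 max pooling over a sequence with wrap-around padding.

    Assumption: circular padding, so the first and last yaw columns
    are treated as neighbours and the output length equals the input length.
    """
    n = len(seq)
    half = kernel // 2
    return [
        max(seq[(i + d) % n] for d in range(-half, half + 1))
        for i in range(n)
    ]


def spp_block(seq, kernel=3, stages=3):
    """Apply `stages` consecutive poolings and sum the intermediate outputs."""
    outputs = []
    current = list(seq)
    for _ in range(stages):
        current = max_pool1d_circular(current, kernel)
        outputs.append(current)
    # Element-wise sum of the pooled states from every stage.
    return [sum(values) for values in zip(*outputs)]


def roll(seq, k):
    """Circularly shift a sequence k positions to the right."""
    k %= len(seq)
    return seq[-k:] + seq[:-k]


# A sparse, spatially discontinuous toy sequence.
x = [0, 2, 0, 0, 5, 0, 0, 1]
y = spp_block(x)
```

Under these assumptions, shifting the input circularly shifts the output by the same amount, which is one way to read the caption's claim that the SPP preserves yaw information while spreading each response over a wider neighbourhood.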
