LiDAR Loop Closure Detection using Semantic Graphs with Graph Attention Networks

Liudi Yang, Ruben Mascaro, Ignacio Alzugaray, Sai Manoj Prakhya, Marco Karrer, Ziyuan Liu, Margarita Chli

TL;DR

This work tackles LiDAR SLAM drift by introducing a semantic-graph loop-closure framework powered by Graph Attention Networks (GATs). It encodes semantic graphs built from segmented point clouds into distinctive graph vectors via a three-branch GAT and a self-attention-based graph encoder, then compares graph vectors with a difference-driven neural module to detect loop closures. A semantic registration stage estimates the 6 DoF pose constraint that is added to the pose graph, improving trajectory consistency. On SemanticKITTI and KITTI-360, the method yields up to a 13% improvement in maximum $F_1$ score over the SGPR baseline and runs in real time with a compact model. The implementation is open source to facilitate further research.

Abstract

In this paper, we propose a novel loop closure detection algorithm that uses graph attention neural networks to encode semantic graphs for place recognition and then uses semantic registration to estimate the 6 DoF relative pose constraint. Our place recognition algorithm has two key modules, namely, a semantic graph encoder module and a graph comparison module. The semantic graph encoder employs graph attention networks to efficiently encode spatial, semantic and geometric information from the semantic graph of the input point cloud. We then use a self-attention mechanism in both the node-embedding and graph-embedding steps to create distinctive graph vectors. The graph vectors of the current scan and a keyframe scan are then compared in the graph comparison module to identify a possible loop closure. Specifically, employing the difference of the two graph vectors yields a significant improvement in performance, as shown in ablation studies. Lastly, we implement a semantic registration algorithm that takes in loop closure candidate scans and estimates the relative 6 DoF pose constraint for the LiDAR SLAM system. Extensive evaluation on public datasets shows that our model is more accurate and robust, achieving a 13% improvement in maximum F1 score on the SemanticKITTI dataset when compared to the baseline semantic graph algorithm. For the benefit of the community, we open-source the complete implementation of our proposed algorithm and a custom implementation of semantic registration at https://github.com/crepuscularlight/SemanticLoopClosure
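The comparison step described in the abstract can be illustrated with a small sketch. It builds a feature vector from the element-wise absolute difference of two graph vectors and scores it with a single sigmoid unit. The function names, the choice to concatenate the raw vectors after the difference term, and the one-layer scoring head are assumptions made for illustration; the paper's module is a trained network and is not reproduced here.

```python
import math


def difference_features(e1, e2):
    """Build comparison features from two graph vectors of equal length.

    Assumption: the absolute difference is concatenated with the two
    raw vectors; the actual feature layout in the paper may differ.
    """
    if len(e1) != len(e2):
        raise ValueError("graph vectors must have the same length")
    # Element-wise absolute difference; symmetric in e1 and e2.
    abs_diff = [abs(a - b) for a, b in zip(e1, e2)]
    return abs_diff + list(e1) + list(e2)


def similarity_score(features, weights, bias):
    """Hypothetical one-layer head: sigmoid of a weighted sum."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Example with made-up vectors and hand-picked (untrained) weights that
# penalise the difference terms and ignore the raw vectors.
query = [0.2, 0.9, 0.4]
keyframe = [0.1, 0.8, 0.4]
feats = difference_features(query, keyframe)
weights = [-5.0] * 3 + [0.0] * 6
score = similarity_score(feats, weights, bias=2.0)
```

Because the difference term is symmetric, swapping the two scans leaves that part of the feature vector unchanged, which is one plausible reason a difference-based feature helps when the same place is revisited from another direction.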

Paper Structure

This paper contains 35 sections, 12 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: The high-level workflow of the proposed semantic graph based loop closure system integrated into a SLAM framework. The proposed loop closure algorithm takes two semantically segmented point clouds as input, which are converted to semantic graphs. After that, semantic graph encoders are deployed to compress them into graph vectors. Finally, the graph comparison module predicts the similarity of the two loop candidates. When the similarity exceeds a specific threshold, a pose constraint is estimated using semantic registration, which is added to the pose graph for trajectory optimization.
  • Figure 2: The architecture of the proposed semantic graph encoder. The semantic graphs created from input point clouds are passed through three GATs to extract contextual spatial, semantic and geometric features. These features are then concatenated and passed through a self-attention module to produce a node embedding $f$. Another self-attention module operates on the node embedding $f$ to learn a global context vector $c$. We finally project the node embedding $f$ onto the global context vector $c$ to obtain the corresponding node weights and use them to calculate the final graph vector $e$.
  • Figure 3: Overview of the graph comparison module. We propose a relative difference vector (shown in orange) as the absolute value of the difference between two graph vectors. The similarity vector is learnt from the first-order and second-order difference vectors and the concatenated graph vectors. This similarity vector is then passed through fully connected layers to predict the similarity value between the two input graph vectors.
  • Figure 4: Precision-Recall curves of max F1 score on SemanticKITTI dataset. In the legend, AUC denotes the area under the curve. Here we compare Ours and Ours-RN with SGPR, SGPR-RN, ScanContext (SC) and ISC. It can be seen that Ours and Ours-RN outperform the other methods on all sequences, especially on Sequence 08, where there are many reverse loop closures.
  • Figure 5: Performance change with varying number of nodes in semantic graph.
  • ...and 1 more figures
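
The pooling step in the Figure 2 caption (projecting the node embedding $f$ onto the context vector $c$ to get node weights, then forming the graph vector $e$) can be sketched as below. This is a minimal illustration under stated assumptions: the sigmoid on the dot product and the mean-then-tanh context vector are stand-ins chosen for the example, whereas the paper learns $c$ with a self-attention module. All function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean_context(node_embeddings):
    """Stand-in context vector c: tanh of the per-dimension mean.

    Assumption: the paper learns c with self-attention; this simple
    average only keeps the example self-contained.
    """
    n = len(node_embeddings)
    dim = len(node_embeddings[0])
    return [math.tanh(sum(f[k] for f in node_embeddings) / n)
            for k in range(dim)]


def pool_graph_vector(node_embeddings, context):
    """Weight each node by its projection onto c, then sum.

    Returns (node_weights, graph_vector).
    """
    # Node weight: sigmoid of the dot product between f_i and c.
    weights = [sigmoid(sum(fi * ci for fi, ci in zip(f, context)))
               for f in node_embeddings]
    dim = len(context)
    # Graph vector e: weighted sum of node embeddings per dimension.
    graph_vector = [sum(w * f[k] for w, f in zip(weights, node_embeddings))
                    for k in range(dim)]
    return weights, graph_vector


# Example with three made-up 2-D node embeddings.
nodes = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
c = mean_context(nodes)
node_weights, e = pool_graph_vector(nodes, c)
```

Nodes whose embeddings align with the context vector receive weights closer to 1 and so contribute more to the graph vector, which is the intuition behind attention-style pooling.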