Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph

Zherong Zhang, Chunyu Lin, Shujuan Huang, Shangrong Yang, Yao Zhao

TL;DR

A geometric correspondence graph neural network that integrates point features with additional structured line segments, improving model performance across different environments and remaining competitive with state-of-the-art techniques.

Abstract

Relative pose estimation is crucial for various computer vision applications, including robotics and autonomous driving. Current methods primarily depend on selecting and matching feature points, which are prone to incorrect matches and lead to poor performance. Consequently, relying solely on point-matching relationships for pose estimation remains a major challenge. To overcome these limitations, we propose a Geometric Correspondence Graph neural network that integrates point features with additional structured line segments. Integrating matched points and line segments further exploits geometric constraints and enhances model performance across different environments. We employ the Dual-Graph module and the Feature Weighted Fusion Module to aggregate geometric and visual features effectively, facilitating complex scene understanding. We demonstrate our approach through extensive experiments on the DeMoN and KITTI Odometry datasets. The results show that our method is competitive with state-of-the-art techniques.

Paper Structure (28 sections, 22 equations, 5 figures, 5 tables)

This paper contains 28 sections, 22 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: The architecture of our proposed Str-L Pose. It contains three components: the Feature Encoder (SCE, LSE, PE), the Dual-Graph Architecture, and the Feature Weighted Fusion Module. In the Dual-Graph Architecture, the Geometric Correspondence Graph employs spatial coordinate encoding to articulate geometric relationships and facilitate accurate pose estimation through structured line segment integration, while the Geometry-Guided Visual Graph extracts and processes visual features from the input image. The Feature Weighted Fusion Module harmonizes geometric and visual features.
  • Figure 2: Trajectories estimated by our method on (a) Seq. 09 and (b) Seq. 10 of the KITTI Odometry dataset.
  • Figure 3: Comparison of the trajectories of our method with those of the depth-constrained methods and other high-precision LiDAR SLAM methods on Seq. 11, Seq. 12, Seq. 14, and Seq. 15 of the KITTI Odometry dataset.
  • Figure 4: Visualization of line segment matching results for image pairs where the Full Model translation result is worse than that of Baseline + V on the RGB-D subset. The red circles indicate matched line segments with obvious errors in line segment length.
  • Figure 5: Visualization of line segment matching results for image pairs where the Full Model translation result is better than that of Baseline + V on the RGB-D subset.