Increasing SLAM Pose Accuracy by Ground-to-Satellite Image Registration
Yanhao Zhang, Yujiao Shi, Shan Wang, Ankit Vora, Akhil Perincherry, Yongbo Chen, Hongdong Li
TL;DR
The paper addresses long-term drift in vision-based SLAM for autonomous driving by fusing SLAM with ground-to-satellite (G2S) image registration. It combines a deep-learning-based G2S registration (BoostG2SLoc) with a coarse-to-fine G2S pose selection and a scaled pose-graph optimization that estimates per-frame scale factors $s_k$, producing drift-corrected trajectories. The method improves translation and rotation accuracy on the KITTI and FordAV datasets, and iterative trajectory refinement further enhances robustness. This work offers a practical path toward GPS-independent, globally consistent localization suitable for real-world autonomous driving scenarios.
Abstract
Vision-based localization for autonomous driving has attracted great interest among researchers. When a pre-built 3D map is not available, visual simultaneous localization and mapping (SLAM) techniques are typically adopted. Due to error accumulation, visual SLAM (vSLAM) usually suffers from long-term drift. This paper proposes a framework that increases localization accuracy by fusing vSLAM with a deep-learning-based ground-to-satellite (G2S) image registration method. In this framework, a coarse-to-fine method, consisting of a spatial correlation bound check (coarse) followed by a visual odometry consistency check (fine), is designed to select valid G2S predictions. The selected predictions are then fused with the SLAM measurements by solving a scaled pose graph problem. To further increase the localization accuracy, we provide an iterative trajectory fusion pipeline. The proposed framework is evaluated on two well-known autonomous driving datasets, and the results demonstrate its accuracy and robustness for vehicle localization.
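The coarse-to-fine selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `select_g2s_predictions`, the parameters `score_min`, `vo_tol`, and `window`, and the use of a single confidence value as a stand-in for the spatial correlation bound check are all assumptions made for illustration. The sketch also assumes 2D positions and visual odometry at metric scale.

```python
import numpy as np


def select_g2s_predictions(g2s_xy, g2s_score, vo_xy,
                           score_min=0.5, vo_tol=1.0, window=2):
    """Return frame indices whose G2S prediction passes both checks.

    g2s_xy:    (N, 2) G2S-predicted positions (hypothetical input layout).
    g2s_score: (N,) per-frame confidence, an assumed stand-in for the
               spatial correlation bound check.
    vo_xy:     (N, 2) visual-odometry positions, assumed metric scale.
    """
    g2s_xy = np.asarray(g2s_xy, dtype=float)
    g2s_score = np.asarray(g2s_score, dtype=float)
    vo_xy = np.asarray(vo_xy, dtype=float)

    # Coarse stage: discard frames whose confidence is below the bound.
    coarse = np.flatnonzero(g2s_score >= score_min)

    selected = []
    for i in coarse:
        # Fine stage: compare against other coarse survivors at most
        # `window` frames away from frame i.
        near = coarse[(coarse != i) & (np.abs(coarse - i) <= window)]
        for j in near:
            # Distances are invariant to a rigid transform between the
            # G2S frame and the VO frame, so no alignment is needed here.
            d_g2s = np.linalg.norm(g2s_xy[i] - g2s_xy[j])
            d_vo = np.linalg.norm(vo_xy[i] - vo_xy[j])
            if abs(d_g2s - d_vo) <= vo_tol:
                selected.append(int(i))
                break
    return selected


# Toy data: frame 2 is a G2S outlier, frame 4 has low confidence.
vo_xy = [[0, 0], [1, 0], [2, 0], [3, 0], [4, 0]]
g2s_xy = [[0, 0], [1, 0], [2, 5], [3, 0], [4, 0]]
g2s_score = [0.9, 0.9, 0.9, 0.9, 0.1]
selected = select_g2s_predictions(g2s_xy, g2s_score, vo_xy)
print(selected)
```

On this toy data the coarse stage removes frame 4 and the fine stage removes frame 2, so the script prints `[0, 1, 3]`. Accepting a frame when it agrees with at least one neighbor is a deliberately simple rule; two outliers shifted by the same offset would pass it, so a stricter vote may be preferable in practice.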
