SP-VINS: A Hybrid Stereo Visual Inertial Navigation System based on Implicit Environmental Map
Xueyu Du, Lilian Zhang, Fuan Duan, Xincan Luo, Maosong Wang, Wenqi Wu, Jun Mao
TL;DR
SP-VINS tackles long-term drift in filter-based visual-inertial navigation by replacing the 3D map with an implicit environmental map built from keyframes and 2D keypoints. It uses a hybrid residual framework that combines landmark reprojection and ray-depth constraints within a DST-EKF, and it calibrates the camera-IMU extrinsic parameters online to handle degraded environments. A loop-closure module uses the implicit map to correct drift without pose-graph optimization or 3D mapping, which reduces computational cost. Benchmark results on EuRoC, TUM-VI, and KAIST-Urban show that SP-VINS achieves long-term, high-accuracy localization with lower computational overhead than state-of-the-art SLAM systems, making it well suited to resource-constrained platforms.
Abstract
Filter-based visual-inertial navigation systems (VINS) have attracted mobile-robot researchers because they offer a good balance between accuracy and efficiency, but their limited mapping quality hampers long-term, high-accuracy state estimation. To address this, we first propose a novel filter-based stereo VINS that, unlike traditional simultaneous localization and mapping (SLAM) systems built on a 3D map, applies efficient loop-closure constraints using an implicit environmental map composed of keyframes and 2D keypoints. Second, we propose a hybrid residual filter framework that combines landmark reprojection and ray constraints to construct a unified Jacobian matrix for measurement updates. Finally, to cope with degraded environments, we incorporate the camera-IMU extrinsic parameters into the visual description to achieve online calibration. Benchmark experiments demonstrate that the proposed SP-VINS achieves high computational efficiency while maintaining long-term, high-accuracy localization, and that it outperforms existing state-of-the-art (SOTA) methods.
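As a rough illustration of what "a unified Jacobian matrix for measurement updates" can mean in a filter, the sketch below stacks two residual blocks into one residual vector and one Jacobian, then applies a single standard EKF update. It is a generic textbook update, not the authors' DST-EKF: the function name `stacked_ekf_update`, the 6-dimensional state, the noise levels, and the random placeholder Jacobians `H_reproj` and `H_ray` are all assumptions made for illustration. A real system would derive those Jacobians from the camera model and the ray geometry.

```python
import numpy as np


def stacked_ekf_update(x, P, blocks):
    """Apply one EKF update built from several residual blocks.

    blocks is a list of (r, H, R) tuples: residual r of shape (m,),
    Jacobian H of shape (m, n), and noise covariance R of shape (m, m).
    The blocks are assumed to have mutually independent noise.
    """
    r = np.concatenate([b[0] for b in blocks])   # stacked residual
    H = np.vstack([b[1] for b in blocks])        # stacked (unified) Jacobian
    m = H.shape[0]
    R = np.zeros((m, m))                         # block-diagonal noise
    i = 0
    for _, _, R_b in blocks:
        k = R_b.shape[0]
        R[i:i + k, i:i + k] = R_b
        i += k

    S = H @ P @ H.T + R                          # innovation covariance
    K = np.linalg.solve(S, H @ P).T              # K = P H^T S^-1 (P, S symmetric)
    x_new = x + K @ r
    I_KH = np.eye(len(x)) - K @ H
    P_new = I_KH @ P @ I_KH.T + K @ R @ K.T      # Joseph form
    return x_new, P_new


# Toy data: random placeholder Jacobians, NOT real camera or ray models.
rng = np.random.default_rng(0)
n = 6
x = np.zeros(n)
P = 0.1 * np.eye(n)
x_true = 0.1 * rng.standard_normal(n)

H_reproj = rng.standard_normal((2, n))   # stands in for a 2D reprojection block
H_ray = rng.standard_normal((1, n))      # stands in for a scalar ray block
R_reproj = 1e-2 * np.eye(2)
R_ray = 1e-2 * np.eye(1)
r_reproj = H_reproj @ (x_true - x)       # noise-free linear residuals
r_ray = H_ray @ (x_true - x)

x_post, P_post = stacked_ekf_update(
    x, P, [(r_reproj, H_reproj, R_reproj), (r_ray, H_ray, R_ray)]
)
```

For linear measurements with independent noise, one stacked update gives the same posterior as processing the blocks one after another. Stacking is therefore mainly a way to handle heterogeneous constraints in a single update step, not a change to the estimator itself.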
