SP-VIO: Robust and Efficient Filter-Based Visual Inertial Odometry with State Transformation Model and Pose-Only Visual Description

Xueyu Du, Lilian Zhang, Chengjun Ji, Xinchan Luo, Huaiyi Zhang, Maosong Wang, Wenqi Wu, Jun Mao

TL;DR

SP-VIO addresses the accuracy-efficiency trade-off in filter-based VIO by replacing the standard extended Kalman filter (Std-EKF) with a double state transformation EKF (DST-EKF) and by adopting a pose-only visual description, which decouples the measurement model from 3D feature reconstruction. It further improves robustness with a DST-RTS backtracking strategy that corrects the motion trajectory after visual interruptions without relying on loop closures. An observability analysis supports the design by showing that SP-VIO has a more stable unobservable subspace. Results on Monte-Carlo simulations, EuRoC, TUM-VI, KITTI, and the authors' own Nudt-VI dataset show higher localization accuracy and robustness than the compared methods, with a runtime close to that of OpenVINS. SP-VIO is particularly suited to payload-constrained platforms and visually deprived scenarios, which suggests potential for embedded navigation and future SLAM extensions.

Abstract

Owing to its high computational efficiency and small memory requirements, filter-based visual inertial odometry (VIO) has good application prospects in miniaturized and payload-constrained embedded systems. However, filter-based methods suffer from insufficient accuracy. To this end, we propose the State transformation and Pose-only VIO (SP-VIO), which rebuilds the state and measurement models and further considers visually deprived conditions. In detail, we first propose the double state transformation extended Kalman filter (DST-EKF) to replace the standard extended Kalman filter (Std-EKF) and improve the system's consistency, and then adopt a pose-only (PO) visual description to avoid the linearization error caused by 3D feature estimation. A comprehensive observability analysis shows that SP-VIO has a more stable unobservable subspace, which better avoids the inconsistency caused by spurious information. Moreover, we propose an enhanced double state transformation Rauch-Tung-Striebel (DST-RTS) backtracking method to optimize motion trajectories during visual interruption. Monte-Carlo simulations and real-world experiments show that SP-VIO achieves better accuracy and efficiency than state-of-the-art (SOTA) VIO algorithms, and better robustness under visually deprived conditions.
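
The backtracking idea can be illustrated with the classical Rauch-Tung-Striebel smoother on which it builds. The sketch below is not the paper's DST-RTS method: it is a textbook scalar Kalman filter with a backward RTS pass, assuming a random-walk motion model. The function names, the noise values `q` and `r`, and the sample measurements are all hypothetical. Missing measurements (`None`) stand in for a visual interruption, and the backward pass then uses the measurements received after the gap to revise the estimates inside it.

```python
def kalman_filter_1d(measurements, q, r, x0, p0):
    """Forward pass of a scalar Kalman filter (assumed random-walk model).

    A measurement of None means no update at that step (prediction only).
    Returns predicted and filtered means and variances for every step.
    """
    x_pred, p_pred, x_filt, p_filt = [], [], [], []
    x, p = x0, p0
    for z in measurements:
        # Predict: x_k = x_{k-1} + w, with w ~ N(0, q)
        xp, pp = x, p + q
        if z is None:
            # No measurement: keep the prediction as the filtered estimate
            x, p = xp, pp
        else:
            # Update with measurement z = x + v, with v ~ N(0, r)
            gain = pp / (pp + r)
            x = xp + gain * (z - xp)
            p = (1.0 - gain) * pp
        x_pred.append(xp)
        p_pred.append(pp)
        x_filt.append(x)
        p_filt.append(p)
    return x_pred, p_pred, x_filt, p_filt


def rts_smoother_1d(x_pred, p_pred, x_filt, p_filt):
    """Backward RTS pass over the stored forward-pass results."""
    n = len(x_filt)
    x_s, p_s = list(x_filt), list(p_filt)
    for k in range(n - 2, -1, -1):
        # Smoother gain C_k = P_{k|k} * F / P_{k+1|k}, with F = 1 here
        c = p_filt[k] / p_pred[k + 1]
        x_s[k] = x_filt[k] + c * (x_s[k + 1] - x_pred[k + 1])
        p_s[k] = p_filt[k] + c * c * (p_s[k + 1] - p_pred[k + 1])
    return x_s, p_s


# Hypothetical data: three missing measurements simulate a visual interruption.
zs = [0.0, 1.0, None, None, None, 5.0, 6.0]
x_pred, p_pred, x_filt, p_filt = kalman_filter_1d(zs, q=0.1, r=0.5, x0=0.0, p0=1.0)
x_smooth, p_smooth = rts_smoother_1d(x_pred, p_pred, x_filt, p_filt)
for k, (xf, xs) in enumerate(zip(x_filt, x_smooth)):
    print(k, round(xf, 3), round(xs, 3))
```

During the gap the filtered estimate stays frozen while its variance grows. The smoothed estimate is pulled toward the later measurements and has a smaller variance, which is the effect a backtracking step relies on.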

Paper Structure

This paper contains 28 sections, 35 equations, 14 figures, 7 tables, 2 algorithms.

Figures (14)

  • Figure 1: The full pipeline of the proposed SP-VIO, together with part of its performance test results on popular public datasets, including trajectory errors and time costs. (a) Framework of SP-VIO. (b) Trajectory error on EuRoC and TUM-VI. (c) Average runtime on EuRoC and TUM-VI.
  • Figure 2: Selected result trajectories on public datasets, compared with OpenVINS and VINS-Mono. (a) EuRoC MH_04_difficult. (b) EuRoC V1_03_difficult. (c) TUM-VI Room3_512_16.
  • Figure 3: Comparison of the estimated trajectories of the three algorithms with the ground truth on nudt_car. The total trajectory length is 4483.54 m, and the localization error percentages of VINS-Mono, OpenVINS, and SP-VIO are 1.62%, 1.11%, and 0.29%, respectively.
  • Figure 4: The output trajectory of MSCKF under visually deprived conditions. (a) Estimated state reconvergence. (b) Estimated state divergence.
  • Figure 5: The output trajectory of VINS-Mono under visually deprived conditions. (a) Estimated state reconvergence. (b) Estimated state divergence.
  • ...and 9 more figures