SP-VIO: Robust and Efficient Filter-Based Visual Inertial Odometry with State Transformation Model and Pose-Only Visual Description
Xueyu Du, Lilian Zhang, Chengjun Ji, Xinchan Luo, Huaiyi Zhang, Maosong Wang, Wenqi Wu, Jun Mao
TL;DR
SP-VIO addresses the accuracy-efficiency trade-off in filter-based visual inertial odometry (VIO) by replacing the standard extended Kalman filter (Std-EKF) with a double state transformation EKF (DST-EKF) and by adopting a pose-only visual description that decouples the measurement model from 3D feature reconstruction. It further improves robustness with a DST-RTS (Rauch-Tung-Striebel) backtracking strategy that corrects the trajectory after visual interruptions without relying on loop closures. An observability analysis indicates that the method has a more stable unobservable subspace. Results on Monte-Carlo simulations, EuRoC, TUM-VI, KITTI, and the authors' own Nudt-VI dataset show higher localization accuracy and robustness than the compared methods, with real-time performance close to that of OpenVINS. SP-VIO is particularly suited to payload-constrained platforms and visually deprived scenarios, which suggests potential for embedded navigation and future SLAM extensions.
Abstract
Owing to its high computational efficiency and small memory footprint, filter-based visual inertial odometry (VIO) is well suited to miniaturized and payload-constrained embedded systems. However, filter-based methods suffer from insufficient accuracy. To this end, we propose State transformation and Pose-only VIO (SP-VIO), which rebuilds the state and measurement models and additionally accounts for visually deprived conditions. In detail, we first propose the double state transformation extended Kalman filter (DST-EKF) to replace the standard extended Kalman filter (Std-EKF) and improve the system's consistency, and then adopt a pose-only (PO) visual description to avoid the linearization error caused by 3D feature estimation. A comprehensive observability analysis shows that SP-VIO has a more stable unobservable subspace, which better avoids the inconsistency caused by spurious information. Moreover, we propose an enhanced double state transformation Rauch-Tung-Striebel (DST-RTS) backtracking method to optimize motion trajectories during visual interruption. Monte-Carlo simulations and real-world experiments show that SP-VIO achieves better accuracy and efficiency than state-of-the-art (SOTA) VIO algorithms and is more robust under visually deprived conditions.
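
For readers unfamiliar with Rauch-Tung-Striebel smoothing, the sketch below may help illustrate the general idea behind backtracking over a measurement gap. It is not the DST-RTS method of the paper: it is the textbook linear RTS smoother applied to a scalar random-walk state, and the function name `kalman_rts_1d`, the noise values `q` and `r`, and the use of `None` to mark missing measurements (standing in for a visual interruption) are all assumptions made for illustration.

```python
def kalman_rts_1d(measurements, q=0.01, r=0.25, x0=0.0, p0=1.0):
    """Scalar random-walk Kalman filter followed by an RTS backward pass.

    measurements: list of floats; None marks a missing measurement
                  (an illustrative stand-in for a visual interruption).
    q, r:         assumed process and measurement noise variances.
    Returns (x_filt, p_filt, x_smooth, p_smooth) as lists.
    """
    n = len(measurements)
    x_pred, p_pred = [0.0] * n, [0.0] * n
    x_filt, p_filt = [0.0] * n, [0.0] * n

    # Forward pass: standard Kalman filter with state transition F = 1.
    x, p = x0, p0
    for k, z in enumerate(measurements):
        p = p + q                      # predict (mean is unchanged for F = 1)
        x_pred[k], p_pred[k] = x, p
        if z is not None:              # update only when a measurement exists
            gain = p / (p + r)
            x = x + gain * (z - x)
            p = (1.0 - gain) * p
        x_filt[k], p_filt[k] = x, p

    # Backward pass: RTS recursion, starting from the last filtered estimate.
    x_smooth, p_smooth = x_filt[:], p_filt[:]
    for k in range(n - 2, -1, -1):
        c = p_filt[k] / p_pred[k + 1]  # smoother gain for F = 1
        x_smooth[k] = x_filt[k] + c * (x_smooth[k + 1] - x_pred[k + 1])
        p_smooth[k] = p_filt[k] + c * c * (p_smooth[k + 1] - p_pred[k + 1])
    return x_filt, p_filt, x_smooth, p_smooth


if __name__ == "__main__":
    # Three missing samples in the middle emulate a measurement outage.
    z = [0.0, 0.1, None, None, None, 1.0, 1.1]
    xf, pf, xs, ps = kalman_rts_1d(z)
    for k in range(len(z)):
        print(k, round(xf[k], 3), round(xs[k], 3))
```

In this toy setting the forward filter simply holds its last estimate during the gap, while the backward pass uses the measurements that arrive afterwards to revise the estimates inside the gap and reduce their variance. This is the same qualitative effect the abstract attributes to backtracking after a visual interruption, although the paper's formulation operates on the full VIO state under the DST model.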
