sqrtVINS: Robust and Ultrafast Square-Root Filter-based 3D Motion Tracking
Yuxiang Peng, Chuchu Chen, Kejian Wu, Guoquan Huang
TL;DR
This work develops sqrtVINS, a square-root covariance filtering-based visual-inertial system designed for embedded, 32-bit platforms. It introduces an LLT-based SRF update that preserves the triangular structure and delivers substantial computational gains, paired with an ultrafast dynamic initialization that achieves robust results within 100 ms using only 3 keyframes. The system couples a structure-aware, sliding-window SRF with stochastic cloning, feature management, online calibration, and an efficient feature-bearing measurement update, achieving over 2x speedups and improved numerical stability relative to state-of-the-art methods, validated on diverse datasets. The authors also provide a full open-source implementation, enabling rapid adoption and extension to multi-sensor and dynamic environments in real-world applications.
Abstract
In this paper, we develop and open-source, for the first time, a square-root filter (SRF)-based visual-inertial navigation system (VINS), termed sqrtVINS, which is ultra-fast, numerically stable, and capable of dynamic initialization even under extreme conditions (i.e., within an extremely small time window). Despite recent advancements in VINS, resource constraints and numerical instability on embedded (robotic) systems with limited precision remain critical challenges. A square-root covariance-based filter offers a promising solution by providing numerical stability, efficient memory usage, and guaranteed positive semi-definiteness. However, canonical SRFs suffer from inefficiencies because their updates disrupt the triangular structure of the square-root covariance factor. The proposed method significantly improves VINS efficiency with a novel Cholesky decomposition (LLT)-based SRF update that fully exploits the system structure to preserve this triangular form. Moreover, we design a fast, robust, dynamic initialization method, which first recovers the minimal states without triangulating 3D features and then efficiently performs iterative SRF updates to refine the full states, enabling seamless VINS operation. The proposed LLT-based SRF is extensively verified through numerical studies, demonstrating superior numerical stability and achieving robust, efficient performance on 32-bit single-precision floats while operating at twice the speed of state-of-the-art (SOTA) methods. Our initialization method, tested on both mobile workstations and Jetson Nano computers, achieves a high initialization success rate even within a 100 ms window under minimal conditions. Finally, the proposed sqrtVINS is extensively validated across diverse scenarios, demonstrating strong efficiency, robustness, and reliability. The full open-source implementation is released to support future research and applications.
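To make the idea of a structure-preserving, Cholesky-based square-root update concrete, the following is a minimal NumPy sketch. It is a generic illustration and not necessarily the update used in sqrtVINS. It assumes the convention P = U^T U with U upper triangular, and the function name `srf_update_llt` and the test dimensions are hypothetical. With A = H U^T, the matrix inversion lemma gives P+ = U^T (I + A^T R^{-1} A)^{-1} U, so factoring I + A^T R^{-1} A = W W^T with W upper triangular yields the new factor U+ = W^{-1} U, which is again upper triangular.

```python
import numpy as np

def srf_update_llt(U, H, R):
    """Square-root covariance measurement update that keeps U upper triangular.

    Convention (an assumption of this sketch): P = U.T @ U.
    Returns U_new such that
    U_new.T @ U_new = P - P H^T (H P H^T + R)^{-1} H P.
    """
    n = U.shape[0]
    A = H @ U.T                                   # shape (m, n)
    M = np.eye(n) + A.T @ np.linalg.solve(R, A)   # symmetric positive definite
    # "Reverse" Cholesky: factor M = W @ W.T with W upper triangular by
    # flipping rows and columns, taking the usual lower factor, and flipping back.
    C = np.linalg.cholesky(M[::-1, ::-1])         # lower triangular
    W = C[::-1, ::-1]                             # upper triangular
    # The inverse of an upper-triangular matrix times an upper-triangular
    # matrix is upper triangular, so the structure of U is preserved.
    return np.linalg.solve(W, U)


# Small random problem (dimensions chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
n, m = 6, 3
U = np.triu(rng.standard_normal((n, n))) + 3.0 * np.eye(n)
H = rng.standard_normal((m, n))
R = 0.1 * np.eye(m)

U_new = srf_update_llt(U, H, R)

# Reference: standard covariance-form Kalman update.
P = U.T @ U
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
P_ref = P - K @ H @ P
```

In this sketch the updated covariance U_new^T U_new is positive semi-definite by construction, and no re-triangularization step (for example a QR factorization of a dense factor) is needed after the update. A practical implementation would replace the general solver with a triangular back-substitution and exploit sparsity in H.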
