XR-VIO: High-precision Visual Inertial Odometry with Fast Initialization for XR Applications
Shangjin Zhai, Nan Wang, Xiaomeng Wang, Danpeng Chen, Weijian Xie, Hujun Bao, Guofeng Zhang
TL;DR
The paper targets fast, robust visual inertial odometry (VIO) for XR by improving two modules: initialization and feature matching. It introduces XR-VIO, which pairs a gyroscope-aided initialization pipeline (VG-SfM, VA-Align, VI-BA) that can initialize from as few as four frames with a hybrid feature matching strategy that fuses optical flow with descriptor-based matching to produce stable, long tracks. The approach achieves state-of-the-art accuracy on multiple public benchmarks (e.g., EuRoC, ZJU-Sensetime) and runs in real time on mobile devices, making it practical for AR/VR deployment. Ablations and mobile AR demos show a higher initialization success rate, reduced drift, and robust performance across motion types. The authors acknowledge limitations in extreme or dynamic environments and suggest learning-based extensions as future work.
Abstract
This paper presents a novel approach to Visual Inertial Odometry (VIO), focusing on the initialization and feature matching modules. Existing methods for initialization often suffer from either poor stability in visual Structure from Motion (SfM) or fragility in solving a huge number of parameters simultaneously. To address these challenges, we propose a new pipeline for visual inertial initialization that robustly handles various complex scenarios. By tightly coupling gyroscope measurements, we enhance the robustness and accuracy of visual SfM. Our method demonstrates stable performance even with only four image frames, yielding competitive results. In terms of feature matching, we introduce a hybrid method that combines optical flow and descriptor-based matching. By leveraging the robustness of continuous optical flow tracking and the accuracy of descriptor matching, our approach achieves efficient, accurate, and robust tracking results. Through evaluation on multiple benchmarks, our method demonstrates state-of-the-art performance in terms of accuracy and success rate. Additionally, a video demonstration on mobile devices showcases the practical applicability of our approach in the field of Augmented Reality/Virtual Reality (AR/VR).
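The abstract does not spell out how gyroscope measurements are coupled into visual SfM. As an illustration only, one common ingredient of gyroscope-aided pipelines is a relative-rotation prior between two camera frames, obtained by integrating the gyroscope samples recorded between them. The sketch below is not the authors' implementation: the function names, the constant sample period `dt`, and the constant `bias` are all assumptions made for this example.

```python
import math

def quat_multiply(q, r):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )

def delta_quat(omega, dt):
    """Unit quaternion for rotating at angular rate omega (rad/s) for dt seconds."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    if rate < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)  # no measurable rotation
    half_angle = 0.5 * rate * dt
    s = math.sin(half_angle) / rate  # scales omega to axis * sin(angle / 2)
    return (math.cos(half_angle), wx * s, wy * s, wz * s)

def integrate_gyro(samples, dt, bias=(0.0, 0.0, 0.0)):
    """Accumulate body-frame gyro samples into one relative rotation.

    Assumes a constant sample period dt and a constant gyro bias (both
    hypothetical simplifications). Increments are right-multiplied, so the
    result maps the body frame at the last sample into the body frame at
    the first sample.
    """
    q = (1.0, 0.0, 0.0, 0.0)
    for w in samples:
        corrected = tuple(wi - bi for wi, bi in zip(w, bias))
        q = quat_multiply(q, delta_quat(corrected, dt))
    return q

def rotation_angle(q):
    """Total rotation angle (radians) encoded by a unit quaternion."""
    w = max(-1.0, min(1.0, abs(q[0])))
    return 2.0 * math.acos(w)

# Usage: 100 samples at 100 Hz while turning about z at pi/2 rad/s.
samples = [(0.0, 0.0, math.pi / 2)] * 100
q_rel = integrate_gyro(samples, dt=0.01)
print(rotation_angle(q_rel))
```

In a gyroscope-aided initialization, a rotation like `q_rel` could be used as a prior on the relative orientation between two frames, leaving the vision side to estimate mainly translation and structure. How XR-VIO actually uses the measurements is described in the paper body, not in this abstract.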
