TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO

Chaoran Xiong; Guoqing Liu; Qi Wu; Songpengcheng Xia; Tong Hua; Kehui Ma; Zhen Sun; Yan Xiang; Ling Pei

TL;DR

This paper introduces online time offset modeling networks (TON) to enhance real-time temporal calibration and proposes feature velocity observation networks to enhance velocity computation for features in unstable visual tracking conditions.

Abstract

Temporal misalignment (time offset) between sensors is common in low-cost visual-inertial odometry (VIO) systems. Such misalignment introduces inconsistent constraints into state estimation, leading to significant positioning drift, especially in high dynamic motion scenarios. In this paper, we focus on online temporal calibration to reduce the positioning drift caused by the time offset in high dynamic motion VIO. For the time offset observation model, most existing methods rely on accurate state estimation or stable visual tracking. For the prediction model, current methods oversimplify the time offset as a constant value with white Gaussian noise. However, these ideal conditions are seldom satisfied in real high dynamic scenarios, resulting in poor performance. We introduce online time offset modeling networks (TON) to enhance real-time temporal calibration. TON improves the accuracy of both time offset observation and prediction modeling. Specifically, for observation modeling, we propose feature velocity observation networks (FVON) to enhance velocity computation for features in unstable visual tracking conditions. For prediction modeling, we present time offset prediction networks (TPN) to learn the evolution pattern of the time offset. To highlight the effectiveness of our method, we integrate the proposed TON into both optimization-based and filter-based VIO systems. Simulation and real-world experiments are conducted to demonstrate the enhanced performance of our approach. Additionally, to contribute to the VIO community, we will open-source the code of our method at https://github.com/Franky-X/FVON-TPN.
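
To make the two models named in the abstract concrete, the following is one common way the conventional baseline is written in the online temporal calibration literature. It is a sketch under assumptions, not the formulation of this paper: the symbols $t_d$ (time offset), $\mathbf{z}_l^k$ (observation of feature $l$ in frame $k$), $\mathbf{V}_l^k$ (feature velocity on the image plane), $n_{t_d}$, and $\sigma_{t_d}$, as well as the sign convention, are all chosen here for illustration.

$$
\mathbf{z}_l^k(t_d) = \mathbf{z}_l^k + t_d\,\mathbf{V}_l^k,
\qquad
\mathbf{V}_l^k \approx \frac{\mathbf{z}_l^{k} - \mathbf{z}_l^{k-1}}{t_k - t_{k-1}}
$$

$$
t_d^{k+1} = t_d^{k} + n_{t_d},
\qquad
n_{t_d} \sim \mathcal{N}\!\left(0, \sigma_{t_d}^2\right)
$$

Under these assumptions, the first line is the observation side: it needs a feature velocity, which a finite difference can only supply when the feature was also seen in the previous frame. The second line is the prediction side: the offset is treated as constant up to white Gaussian noise, which is the simplification the abstract argues against.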

Paper Structure (26 sections, 16 equations, 14 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Classic Methods for Temporal Alignment and Calibration
Learning Methods for Temporal Calibration
Problem Formulation
Observation Model for Time Offset
Prediction Model for Time Offset
Methodology
System Overview
FVON: Feature Velocity Observation Networks
Inverse Time Series (ITS)-FVON
Frame-to-Frame (F2F)-FVON
TPN: Time offset Prediction Networks
Implementation of FVON and TPN in VIO
Networks Configuration
...and 11 more sections

Figures (14)

Figure 1: System overview of the proposed online weakly-supervised TON, composed of FVON and TPN, which enhances online temporal calibration. The system inputs are raw camera and IMU data with a time-varying offset between them. First, IMU pre-integration is performed according to the received camera and IMU timestamps. At the same time, the visual front-end extracts and tracks features between consecutive frames. Second, the IMU pre-integration results and the features tracked by the front-end are passed to a solver to perform re-projection. During the solving process, the proposed weakly-supervised FVON and TPN enhance the observation and prediction modeling for time offset estimation.
Figure 2: The observed features in two consecutive frames. The blue points represent the features observed in frame $I^{k-1}$, while the orange points represent the features observed in frame $I^{k}$. Arrows indicate the matches between them. The traditional feature velocity computation can only estimate the velocity of features that have a previous observation; it cannot handle features newly introduced in frame $I^{k}$, because they lack a valid track in the previous frame.
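
The limitation described in the Figure 2 caption can be illustrated with a minimal Python sketch of a finite-difference feature velocity. This is an illustration under assumptions, not the authors' implementation: the function name `feature_velocities`, the dictionary layout (feature id mapped to pixel coordinates), and the frame interval `dt` are all hypothetical.

```python
import numpy as np


def feature_velocities(prev_obs, curr_obs, dt):
    """Finite-difference velocity for features matched across two frames.

    prev_obs, curr_obs: dict mapping a feature id to (u, v) pixel coordinates
    in frames I^{k-1} and I^{k} (hypothetical layout, for illustration only).
    dt: time between the two frames in seconds.
    Returns (velocities, untracked_ids).
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    velocities = {}
    untracked_ids = []
    for fid, uv in curr_obs.items():
        if fid in prev_obs:
            # A previous observation exists, so the difference is defined.
            delta = np.asarray(uv, dtype=float) - np.asarray(prev_obs[fid], dtype=float)
            velocities[fid] = delta / dt
        else:
            # Newly introduced feature: no previous observation, no velocity.
            untracked_ids.append(fid)
    return velocities, untracked_ids


# Feature 1 is matched; feature 3 first appears in the current frame.
prev_obs = {1: (100.0, 50.0), 2: (200.0, 80.0)}
curr_obs = {1: (104.0, 52.0), 3: (10.0, 10.0)}
velocities, untracked_ids = feature_velocities(prev_obs, curr_obs, dt=0.1)
```

In this sketch the features in `untracked_ids` are exactly the ones for which the caption says the traditional method fails; they are the cases a learned velocity model would have to cover.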
Figure 3: ITS-FVON architecture. The feature velocities of future frames are normalized to $[0,1]$ and then input to the ITS-FVON. The hidden state passes the temporal relationship among the future, present, and previous feature velocities. The output is the predicted feature velocity of the previous frame.
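
The caption states that velocities are normalized to $[0,1]$ but does not say how. A minimal sketch, assuming per-axis min-max scaling (an assumption made here, not a detail confirmed by the source), is shown below; the helper names `minmax_normalize` and `minmax_denormalize` are hypothetical.

```python
import numpy as np


def minmax_normalize(velocities, eps=1e-12):
    """Scale each velocity axis to [0, 1] using its own min and max.

    velocities: array of shape (N, 2) holding (du/dt, dv/dt) per feature.
    Returns the normalized array plus the offset and scale needed to invert it.
    """
    v = np.asarray(velocities, dtype=float)
    lo = v.min(axis=0)
    # Guard against a zero range when every feature shares the same velocity.
    scale = np.maximum(v.max(axis=0) - lo, eps)
    return (v - lo) / scale, lo, scale


def minmax_denormalize(normalized, lo, scale):
    """Map values in [0, 1] back to the original velocity units."""
    return np.asarray(normalized, dtype=float) * scale + lo


raw = np.array([[-10.0, 0.0], [0.0, 5.0], [30.0, 20.0]])
normalized, lo, scale = minmax_normalize(raw)
restored = minmax_denormalize(normalized, lo, scale)
```

Keeping `lo` and `scale` around matters in such a setup because a network output in normalized units has to be mapped back to pixel velocities before it can be used in a time offset residual.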
Figure 4: ITS-FVON weakly-supervised training strategy. The blue points represent features that have previous observations, so their velocity can be calculated with the traditional method. We take these features as labels to train the network. The red points represent features whose velocity cannot be calculated with the traditional method because they lack a valid previous observation. The trained network is used to predict their velocity.
Figure 5: F2F-FVON weakly-supervised training strategy. The blue points represent features with a previous observation, so their velocity can be calculated with the traditional method. These features are taken as labels to train the network. The orange points represent features whose velocity cannot be calculated with the traditional method because they lack a valid observation. The trained F2F-FVON is used to predict their velocity.
...and 9 more figures
