DeepUKF-VIN: Adaptively-tuned Deep Unscented Kalman Filter for 3D Visual-Inertial Navigation based on IMU-Vision-Net

Khashayar Ghanizadegan, Hashim A. Hashim

TL;DR

This work tackles GPS-denied 3D Visual-Inertial Navigation (VIN) by developing DeepUKF-VIN, a quaternion-based Unscented Kalman Filter (UKF) augmented with a Deep Learning-based Adaptation Mechanism (DLAM) that adapts the noise covariances in real time. DLAM comprises IMU-Net and Vision-Net, which estimate covariance scaling factors from IMU sequences and stereo images, respectively, enabling dynamic tuning of the filter's noise covariance matrices to improve estimation of orientation $q$, position $p$, and velocity $v$. The method is trained on EuRoC data and is reported to outperform a standard UKF and a DeepEKF in accuracy and stability, particularly at low sampling rates in GPS-denied conditions. The results suggest practical value for robust, low-cost VIN on robotic and aerial platforms, with possible extensions to additional sensors and filters.
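As a rough illustration of what "covariance scaling factors" can mean in practice, the sketch below multiplies a nominal diagonal covariance by bounded positive factors derived from raw network outputs. The `10 ** (beta * tanh(z))` mapping, the function name `scale_covariance`, and the parameter `beta` are assumptions chosen for illustration; this summary does not state which mapping IMU-Net or Vision-Net actually use.

```python
import math


def scale_covariance(nominal_diag, raw_outputs, beta=1.0):
    """Return a scaled copy of a diagonal covariance.

    nominal_diag: hand-tuned variances (diagonal entries), all positive.
    raw_outputs:  unbounded per-entry network outputs (hypothetical).
    beta:         the factors stay within [10**-beta, 10**beta].
    """
    if len(nominal_diag) != len(raw_outputs):
        raise ValueError("one raw output is needed per diagonal entry")
    return [
        var * 10.0 ** (beta * math.tanh(z))
        for var, z in zip(nominal_diag, raw_outputs)
    ]


# Zero outputs leave the nominal tuning untouched; positive outputs inflate
# the variance (less trust in that channel), negative outputs shrink it.
nominal = [0.01, 0.01, 0.04]
scaled = scale_covariance(nominal, [0.0, 2.0, -2.0])
```

Bounding the factor keeps every variance strictly positive, so the scaled matrix remains a valid covariance no matter what the network emits.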

Abstract

This paper addresses the challenge of estimating the orientation, position, and velocity of a vehicle operating in three-dimensional (3D) space with six degrees of freedom (6-DoF). A Deep Learning-based Adaptation Mechanism (DLAM) is proposed to adaptively tune the noise covariance matrices of Kalman-type filters for the Visual-Inertial Navigation (VIN) problem, leveraging IMU-Vision-Net. Subsequently, an adaptively tuned Deep Learning Unscented Kalman Filter for 3D VIN (DeepUKF-VIN) is introduced to utilize the proposed DLAM, thereby robustly estimating key navigation components, including orientation, position, and linear velocity. The proposed DeepUKF-VIN integrates data from onboard sensors, specifically an inertial measurement unit (IMU) and visual feature points extracted from a camera, and is applicable for GPS-denied navigation. Its quaternion-based design effectively captures navigation nonlinearities and avoids the singularities commonly encountered with Euler-angle-based filters. Implemented in discrete space, the DeepUKF-VIN facilitates practical filter deployment. The filter's performance is evaluated using real-world data collected from an IMU and a stereo camera at low sampling rates. The results demonstrate filter stability and rapid attenuation of estimation errors, highlighting its high estimation accuracy. Furthermore, comparative testing against the standard Unscented Kalman Filter (UKF) in two scenarios consistently shows superior performance across all navigation components, thereby validating the efficacy and robustness of the proposed DeepUKF-VIN.

Keywords: Deep Learning, Unscented Kalman Filter, Adaptive tuning, Estimation, Navigation, Unmanned Aerial Vehicle, Sensor-fusion.
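The abstract's point about a quaternion-based, discrete-time design can be illustrated with a minimal attitude propagation step. This is a sketch under stated assumptions, not the paper's filter: quaternions are stored as `(w, x, y, z)`, the Hamilton product is used, the angular rate is expressed in the body frame and held constant over `dt`, and the helper names `quat_multiply` and `propagate` are hypothetical.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def propagate(q, omega, dt):
    """Rotate q by a body-frame angular rate omega (rad/s) held for dt seconds."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if angle < 1e-12:
        return q  # negligible rotation, keep the estimate unchanged
    s = math.sin(angle / 2.0) / rate
    dq = (math.cos(angle / 2.0), wx * s, wy * s, wz * s)
    w, x, y, z = quat_multiply(q, dq)
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    return (w / norm, x / norm, y / norm, z / norm)  # re-normalise to unit length


# Pitch up by 90 degrees in 100 small steps. This is the attitude at which
# roll-pitch-yaw Euler angles hit gimbal lock; the quaternion stays well defined.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = propagate(q, (0.0, math.pi / 2.0, 0.0), 0.01)
```

After the loop, `q` is close to `(cos(pi/4), 0, sin(pi/4), 0)`, a valid unit quaternion for the 90-degree pitch.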

Paper Structure

This paper contains 29 sections, 57 equations, 7 figures, 3 tables, 1 algorithm.

Figures (7)

  • Figure 1: IMU-Net Architecture Schematics
  • Figure 2: Vision-Net Architecture Schematics
  • Figure 3: Summary schematic architecture of quaternion-based DeepUKF-VIN. First, the Aggregate Predict step of the filter is executed, incorporating the last known state information, aggregated IMU data, and the IMU noise covariance computed by IMU-Net. Next, the Update step is performed using the predicted state information and the vision covariance matrix estimated by Vision-Net. Raw IMU and vision data are used as inputs to IMU-Net and Vision-Net, respectively.
  • Figure 4: Training loss convergence over 30 epochs.
  • Figure 5: Matched feature points between the left and right images of a set of stereo image measurements using the EuRoC dataset (Burri et al., 2016).
  • ...and 2 more figures
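The data flow described in the Figure 3 caption (aggregate predict over buffered IMU samples, then an update with a vision measurement, with each noise covariance supplied by its own network) can be sketched as below. To stay short, the sketch swaps the quaternion-based UKF for a scalar linear Kalman filter, and `imu_net` and `vision_net` are hypothetical placeholder heuristics rather than trained networks; all names, models, and numbers here are assumptions for illustration only.

```python
def imu_net(imu_window):
    """Hypothetical stand-in for IMU-Net: returns a process-noise variance."""
    # Placeholder heuristic: noisier-looking windows get a larger variance.
    mean = sum(imu_window) / len(imu_window)
    spread = sum((u - mean) ** 2 for u in imu_window) / len(imu_window)
    return 1e-3 + spread


def vision_net(num_matched_features):
    """Hypothetical stand-in for Vision-Net: returns a measurement variance."""
    # Placeholder heuristic: fewer matched features means less trust.
    return 1.0 / max(num_matched_features, 1)


def aggregate_predict(x, p, imu_window, dt, q):
    """Propagate the estimate through every IMU sample since the last image."""
    for u in imu_window:
        x = x + u * dt  # scalar motion model: rate input integrated over dt
        p = p + q * dt  # uncertainty grows with each prediction step
    return x, p


def update(x, p, z, r):
    """Scalar Kalman correction with measurement variance r."""
    k = p / (p + r)  # Kalman gain
    return x + k * (z - x), (1.0 - k) * p


x, p = 0.0, 1.0
imu_window = [1.0, 1.1, 0.9, 1.0]  # rate samples between two camera frames
q = imu_net(imu_window)
x_pred, p_pred = aggregate_predict(x, p, imu_window, dt=0.05, q=q)
r = vision_net(num_matched_features=50)
x_new, p_new = update(x_pred, p_pred, z=0.21, r=r)
```

Running several predictions per correction mirrors a setup in which IMU samples arrive faster than camera frames; the variance grows during prediction and contracts again once the vision measurement is applied.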