DEIO: Deep Event Inertial Odometry

Weipeng Guan, Fuling Lin, Peiyu Chen, Peng Lu

TL;DR

DEIO tackles robust monocular event-based odometry by uniting a learning-based event data association with IMU-driven optimization in a graph-based backend. The approach uses an event patch network with a differentiable bundle adjustment layer to produce sparse, high-confidence correspondences, whose Hessian information is embedded into a patch-based co-visibility factor graph that also incorporates IMU pre-integration. Training is performed offline to learn robust data associations, while online optimization fuses learned information with IMU constraints to produce up-to-scale 6-DoF poses within a keyframe sliding window. Comprehensive evaluations on ten real-world benchmarks show DEIO outperforming more than 20 state-of-the-art methods, including strong gains in challenging lighting, high-speed, and low-texture scenes, and across diverse platforms, with real-time performance and evidence of good generalization. The work demonstrates the practicality of learning-optimization hybrids for event-based SLAM, and provides code and datasets to foster further research.

Abstract

Event cameras show great potential for visual odometry (VO) in handling challenging situations, such as fast motion and high dynamic range. Despite this promise, the sparse and motion-dependent characteristics of event data continue to limit the performance of feature-based or direct-based data association methods in practical applications. To address these limitations, we propose Deep Event Inertial Odometry (DEIO), the first monocular learning-based event-inertial framework, which combines a learning-based method with traditional nonlinear graph-based optimization. Specifically, an event-based recurrent network is adopted to provide accurate and sparse associations of event patches over time. DEIO further integrates it with the IMU to recover up-to-scale pose and provide robust state estimation. The Hessian information derived from the learned differentiable bundle adjustment (DBA) is utilized to optimize the co-visibility factor graph, which tightly incorporates event patch correspondences and IMU pre-integration within a keyframe-based sliding window. Comprehensive validations demonstrate that DEIO achieves superior performance on 10 challenging public benchmarks compared with more than 20 state-of-the-art methods.
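One common way to hand "Hessian information" from a bundle adjustment layer to an outer factor graph is to marginalize the per-patch depth variables with a Schur complement, leaving a pose-only quadratic factor that can be summed with other factors such as IMU pre-integration. The sketch below illustrates that general technique on a toy dense system; it is an assumption for illustration, not the authors' implementation. The function name `marginalize_depths`, the variable ordering (poses first, depths last), and the `H_imu`/`b_imu` stand-ins are all hypothetical.

```python
import numpy as np

def marginalize_depths(H, b, n_pose):
    """Schur-complement out the trailing (depth) block of a Gauss-Newton system.

    Assumes the variables are ordered [pose block, depth block] and that
    H is symmetric positive definite. Returns the pose-only (H_red, b_red).
    """
    H_pp = H[:n_pose, :n_pose]
    H_pd = H[:n_pose, n_pose:]
    H_dd = H[n_pose:, n_pose:]
    b_p, b_d = b[:n_pose], b[n_pose:]
    # Solve H_dd X = [H_pd^T | b_d] instead of forming an explicit inverse.
    sol = np.linalg.solve(H_dd, np.column_stack([H_pd.T, b_d]))
    H_red = H_pp - H_pd @ sol[:, :-1]
    b_red = b_p - H_pd @ sol[:, -1]
    return H_red, b_red

# Toy system: 6 pose parameters, 4 depth parameters, 40 residuals.
rng = np.random.default_rng(0)
n_pose, n_depth = 6, 4
J = rng.standard_normal((40, n_pose + n_depth))
r = rng.standard_normal(40)
H = J.T @ J + 1e-6 * np.eye(n_pose + n_depth)
b = -J.T @ r

H_vis, b_vis = marginalize_depths(H, b, n_pose)

# Hypothetical stand-in for an IMU factor linearized over the same pose block.
H_imu = 0.5 * np.eye(n_pose)
b_imu = np.zeros(n_pose)

# Fused pose update: the visual and inertial quadratic factors simply add.
dx_pose = np.linalg.solve(H_vis + H_imu, b_vis + b_imu)
```

Without the added IMU term, solving the reduced system gives the same pose update as solving the full system, which is what makes the reduced Hessian a faithful summary of the visual constraints.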

Paper Structure

This paper contains 19 sections, 13 equations, 10 figures, 10 tables.

Figures (10)

  • Figure 1: Overview of the DEIO system. It decouples network training from IMU integration and operates in two phases: offline training and online optimization. The main innovations of this work reside in the effective integration of IMU measurements with learning-based methods. During training, a unified event-based optical flow network is trained to provide robust data associations of sparse event patches. At runtime, the Hessian information, derived from the DBA layer in the update operator, is utilized to tightly integrate event patch correspondence with IMU pre-integration through an event patch-based co-visibility factor graph optimization.
  • Figure 2: Patch-based co-visibility factor graph for event-IMU combined bundle adjustment.
  • Figure 3: Comparison of the estimated position (X, Y, Z) and orientation (Roll, Pitch, Yaw) of our DEIO with DEVO on the sequences (a) boxes_6dof and (b) poster_6dof from the DAVIS240c dataset. DEIO efficiently resolves scale ambiguity and aligns closely with the ground truth trajectory.
  • Figure 4: Comparison of the estimated trajectories (in X, Y, Z, and the XY-plane) with DEVO on the Mono & Stereo HKU dataset. DEIO resolves scale ambiguity and aligns precisely with the ground truth trajectory. In contrast, the baseline estimates exhibit significant scale discrepancies: (a) the baseline trajectory suffers from drift and an overestimated scale; (b) the baseline trajectory shows an underestimated scale.
  • Figure 5: The estimated trajectories of our DEIO against the ground truth (GT) on the ziggy_hdr and rocket_dark sequences from the EDS dataset. The image view (visualization only) shows the lack of perceptible information under low-light conditions, while the event view, though perceptible, remains susceptible to interference from the infrared light of the motion capture system. Thanks to our robust learning-based event data association, the trajectories estimated by DEIO align remarkably closely with the GT.
  • ...and 5 more figures