Tight Fusion of Events and Inertial Measurements for Direct Velocity Estimation
Wanting Xu, Xin Peng, Laurent Kneip
TL;DR
This work introduces a velocity-centric fusion of event-camera measurements with inertial data to directly estimate camera velocity, avoiding reliance on absolute pose or map coordinates. By employing a continuous event-line constraint (CELC) based on trifocal tensor geometry, it links asynchronous events from moving lines to translational velocity, and uses a nested two-layer RANSAC with a novel 4-event line solver to bootstrap velocity. The back-end then tightly couples event measurements with IMU pre-integration in a sliding window, including a consistency term to enforce cross-slice line coherence. Experiments on synthetic and real data show continuous, robust velocity estimates in highly dynamic conditions, with superior or more stable kinematic estimates than conventional frame-based visual-inertial odometry, especially when motion blur or low texture degrades images. This velocity-first paradigm offers a fail-safe sensor-fusion pathway for fast-acting control in UAVs and dynamic platforms, and opens avenues for integrating lines and points in global mapping while maintaining robust velocity signals.
Abstract
Traditional visual-inertial state estimation targets absolute camera poses and spatial landmark locations while first-order kinematics are typically resolved as an implicitly estimated sub-state. However, this poses a risk in velocity-based control scenarios, as the quality of the estimation of kinematics depends on the stability of absolute camera and landmark coordinates estimation. To address this issue, we propose a novel solution to tight visual-inertial fusion directly at the level of first-order kinematics by employing a dynamic vision sensor instead of a normal camera. More specifically, we leverage trifocal tensor geometry to establish an incidence relation that directly depends on events and camera velocity, and demonstrate how velocity estimates in highly dynamic situations can be obtained over short time intervals. Noise and outliers are dealt with using a nested two-layer RANSAC scheme. Additionally, smooth velocity signals are obtained from a tight fusion with pre-integrated inertial signals using a sliding window optimizer. Experiments on both simulated and real data demonstrate that the proposed tight event-inertial fusion leads to continuous and reliable velocity estimation in highly dynamic scenarios independently of absolute coordinates. Furthermore, in extreme cases, it achieves more stable and more accurate estimation of kinematics than traditional, point-position-based visual-inertial odometry.
