ESVIO: Event-based Stereo Visual Inertial Odometry

Peiyu Chen, Weipeng Guan, Peng Lu

TL;DR

ESVIO tackles robust pose estimation under high dynamic range (HDR) and fast-motion conditions by introducing a stereo event-based VIO that fuses stereo events, standard images, and IMU measurements in a sliding-window graph optimization. Motion compensation aligns events to reference times using the IMU angular velocity and the estimated velocity. Geometry-driven spatial and temporal data association then matches features across the stereo pair and across time, and depth is estimated by triangulation. The graph-based optimization jointly minimizes IMU, event, and image residuals, with features parameterized by inverse depth. The paper presents two variants, ESIO (purely event-based) and ESVIO (image-aided), and reports substantial improvements over state-of-the-art image-based and event-based baselines on public and self-collected data, along with onboard quadrotor flights and a large-scale outdoor evaluation. The work targets real-time, robust state estimation in HDR and aggressive-motion environments.
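The triangulation and inverse-depth steps mentioned above can be illustrated with a minimal sketch. It assumes an ideal rectified stereo pair, which the summary does not state, and the function name `triangulate_rectified` and its parameters (`fx`, `baseline`) are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def triangulate_rectified(u_left, u_right, fx, baseline):
    """Depth and inverse depth of one feature matched across a rectified stereo pair.

    Assumption (not from the paper): both cameras are rectified, share the
    focal length fx (pixels), and are separated by `baseline` (meters) along x.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        # A non-positive disparity means a bad match or a point at infinity.
        return None
    depth = fx * baseline / disparity      # Z = f * b / d
    inverse_depth = 1.0 / depth            # the quantity a back-end could optimize
    return depth, inverse_depth


result = triangulate_rectified(u_left=120.0, u_right=110.0, fx=200.0, baseline=0.1)
```

Inverse depth is a common choice in sliding-window optimizers because distant points map to values near zero instead of very large depths, which tends to be better conditioned numerically.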

Abstract

Event cameras that asynchronously output low-latency event streams provide great opportunities for state estimation under challenging situations. Although event-based visual odometry has been extensively studied in recent years, most methods are monocular, and little research addresses stereo event vision. In this paper, we present ESVIO, the first event-based stereo visual-inertial odometry, which leverages the complementary advantages of event streams, standard images, and inertial measurements. Our proposed pipeline achieves temporal tracking and instantaneous matching between consecutive stereo event streams, thereby obtaining robust state estimation. In addition, the motion compensation method is designed to emphasize scene edges by warping each event to reference moments using the IMU and the ESVIO back-end. We validate that both ESIO (purely event-based) and ESVIO (events with image aid) outperform other image-based and event-based baseline methods on public and self-collected datasets. Furthermore, we use our pipeline to perform onboard quadrotor flights in low-light environments. A real-world large-scale experiment is also conducted to demonstrate long-term effectiveness. We highlight that this work is a real-time, accurate system aimed at robust state estimation in challenging environments.
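The idea of warping each event to a reference moment can be sketched as follows. This is a first-order illustration under stated assumptions, not the paper's formulation: the angular velocity `omega` and linear velocity `vel` are taken to be constant over the window and expressed in the camera frame, each event is given a known depth, and the names `rodrigues` and `warp_events` are hypothetical.

```python
import numpy as np


def rodrigues(rotvec):
    """Rotation matrix for a rotation vector (axis times angle, radians)."""
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rotvec / theta
    Kx = np.array([[0.0, -k[2], k[1]],
                   [k[2], 0.0, -k[0]],
                   [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * Kx + (1.0 - np.cos(theta)) * (Kx @ Kx)


def warp_events(xy, t, t_ref, omega, vel, depth, K):
    """Warp event pixels xy (N, 2) with timestamps t (N,) to time t_ref.

    Assumed model: a static point P in the camera frame moves as
    dP/dt = -omega x P - vel, integrated to first order in the translation.
    """
    xy = np.asarray(xy, dtype=float)
    depth = np.broadcast_to(np.asarray(depth, dtype=float), (len(t),))
    K_inv = np.linalg.inv(K)
    warped = np.empty_like(xy)
    for i in range(len(t)):
        dt = t_ref - t[i]
        ray = K_inv @ np.array([xy[i, 0], xy[i, 1], 1.0])  # normalized ray, z = 1
        P = depth[i] * ray                                  # 3D point at event time
        P_ref = rodrigues(-omega * dt) @ P - vel * dt       # point at reference time
        uv = K @ (P_ref / P_ref[2])                         # reproject to pixels
        warped[i] = uv[:2]
    return warped


# Hypothetical intrinsics and two events, for illustration only.
K = np.array([[100.0, 0.0, 64.0], [0.0, 100.0, 48.0], [0.0, 0.0, 1.0]])
events_xy = np.array([[64.0, 48.0], [10.0, 20.0]])
events_t = np.array([0.0, 0.005])
static = warp_events(events_xy, events_t, 0.01, np.zeros(3), np.zeros(3), 1.0, K)
moving = warp_events(events_xy[:1], events_t[:1], 0.01,
                     np.zeros(3), np.array([1.0, 0.0, 0.0]), 1.0, K)
```

With zero motion the events stay where they are. With the camera translating along +x, the event at the principal point shifts toward -x, which is the kind of alignment that sharpens edges when events are accumulated at the reference time.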

Paper Structure (20 sections, 8 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: Our ESVIO provides robust, accurate, real-time pose feedback for drones under aggressive motion. Events provide rich and reliable features, while only a few features are tracked in image frames during high-speed motion. Bottom left: stereo event-based feature tracking. Bottom right: stereo image-based feature tracking.
  • Figure 2: The structure of our ESVIO and ESIO pipeline.
  • Figure 3: The event streams without and with motion compensation.
  • Figure 4: Stereo event-corner features: (a) Geometry principle; (b) Temporally and spatially associating the event-corner features on the time surface; (c) Event-corner features tracking on the event streams.
  • Figure 5: Our self-designed quadrotor platform.
  • ...and 3 more figures