ES-PTAM: Event-based Stereo Parallel Tracking and Mapping
Suman Ghosh, Valentina Cavinato, Guillermo Gallego
TL;DR
This work tackles robust visual odometry and SLAM under challenging conditions using stereo event cameras. It introduces ES-PTAM, a parallel tracking-and-mapping system that couples an improved ray-density fusion mapper with an edge-map-based tracker, both operating directly on event streams. The approach scales to multi-camera setups and is validated on five real-world datasets, including a trinocular EVIMO2 sequence, where it frequently outperforms the state-of-the-art methods ESVO and EVO in pose accuracy and yields sharper semi-dense maps. The authors provide extensive qualitative and quantitative results and release an open-source implementation. Overall, this work advances purely event-based VO/SLAM, with potential impact for autonomous vehicles and mobile robots operating in HDR and high-speed environments.
Abstract
Visual Odometry (VO) and SLAM are fundamental components for spatial perception in mobile robots. Despite enormous progress in the field, current VO/SLAM systems are limited by the capabilities of their sensors. Event cameras are novel visual sensors that offer advantages to overcome the limitations of standard cameras, enabling robots to expand their operating range to challenging scenarios, such as high-speed motion and high dynamic range illumination. We propose a novel event-based stereo VO system by combining two ideas: a correspondence-free mapping module that estimates depth by maximizing ray density fusion and a tracking module that estimates camera poses by maximizing edge-map alignment. We evaluate the system comprehensively on five real-world datasets, spanning a variety of camera types (manufacturers and spatial resolutions) and scenarios (driving, flying drone, hand-held, egocentric, etc.). The quantitative and qualitative results demonstrate that our method outperforms the state of the art on the majority of the test sequences, e.g., with trajectory error reductions of 45% on the RPG dataset, 61% on the DSEC dataset, and 21% on the TUM-VIE dataset. To benefit the community and foster research on event-based perception systems, we release the source code and results: https://github.com/tub-rip/ES-PTAM
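The correspondence-free mapping idea can be illustrated with a minimal sketch. It assumes that each camera has already back-projected its events into a ray-density volume of shape (depth planes, height, width) in a common reference view. The abstract does not state the fusion operator, so a harmonic mean is assumed here for illustration. The function names `fuse_ray_densities` and `extract_semi_dense_depth`, and the `min_density` threshold, are hypothetical and not taken from the released code.

```python
import numpy as np

def fuse_ray_densities(volumes, eps=1e-9):
    """Fuse per-camera ray-density volumes (each D x H x W).

    Assumption: harmonic-mean fusion. It is large only where every camera
    has high ray density, so no explicit event matching is needed.
    """
    stack = np.stack(volumes, axis=0).astype(np.float64)
    n_cams = stack.shape[0]
    return n_cams / np.sum(1.0 / (stack + eps), axis=0)

def extract_semi_dense_depth(fused, depth_planes, min_density):
    """Pick, per pixel, the depth plane with maximal fused ray density.

    Pixels whose peak density is below min_density are left without depth
    (NaN), which yields a semi-dense depth map.
    """
    best_plane = np.argmax(fused, axis=0)   # H x W indices into depth_planes
    peak = np.max(fused, axis=0)            # H x W peak densities
    mask = peak >= min_density
    depth = np.where(mask, depth_planes[best_plane], np.nan)
    return depth, mask

# Toy example: 3 depth planes, 2 x 2 pixels, two cameras.
depth_planes = np.array([1.0, 2.0, 4.0])
left = np.zeros((3, 2, 2))
right = np.zeros((3, 2, 2))
left[1, 0, 0] = 8.0
right[1, 0, 0] = 8.0    # both cameras agree at the 2.0 plane for pixel (0, 0)
left[2, 1, 1] = 8.0     # only the left camera sees rays at pixel (1, 1)

fused = fuse_ray_densities([left, right])
depth_map, mask = extract_semi_dense_depth(fused, depth_planes, min_density=1.0)
```

In this toy case only pixel (0, 0) receives a depth value, because the harmonic mean suppresses voxels supported by a single camera. Other fusion operators (for example, an arithmetic or geometric mean) would trade off this selectivity differently.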
