EVLoc: Event-based Visual Localization in LiDAR Maps via Event-Depth Registration
Kuangyi Chen, Jun Zhang, Friedrich Fraundorfer
TL;DR
EVLoc addresses robust 6-DoF localization by aligning event frames with depth maps rendered from existing LiDAR maps, using a RAFT-based event-depth flow estimator. It introduces a frame-based event representation (Temporal-Spatial stable Time Surface, TSTS) and an Offset Alleviation Module that compensates for bias in the ground-truth poses, enabling reliable 2D-3D correspondences for PnP pose estimation under challenging motion and lighting. The approach yields accurate pose refinement on indoor and outdoor LiDAR-map sequences, outperforming a conventional image-based baseline in high-dynamic-range and fast-motion scenarios. By relying on LiDAR maps as references and providing open-source code and models, EVLoc enhances scalability and practical deployment for autonomous systems.
Abstract
Event cameras are bio-inspired sensors with notable features, including high dynamic range and low latency, which make them exceptionally suitable for perception in challenging scenarios such as high-speed motion and extreme lighting conditions. In this paper, we explore their potential for localization within pre-existing LiDAR maps, a critical task for applications that require precise navigation and mobile manipulation. Our framework follows a paradigm based on the refinement of an initial pose. Specifically, we first project LiDAR points into 2D space based on a rough initial pose to obtain depth maps, then employ an optical flow estimation network to align events with LiDAR points in 2D space, and finally estimate the camera pose using a PnP solver. To enhance geometric consistency between these two inherently different modalities, we develop a novel frame-based event representation that improves structural clarity. Additionally, given the varying degrees of bias observed in the ground-truth poses, we design a module that predicts an auxiliary variable as a regularization term to mitigate the impact of this bias on network convergence. Experimental results on several public datasets demonstrate the effectiveness of our proposed method. To facilitate future research, both the code and the pre-trained models are made available online.
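The geometric part of the pipeline described above (rendering a depth map from the LiDAR map under a rough initial pose, then turning a predicted event-depth flow into 2D-3D correspondences) can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function names `project_to_depth` and `correspondences_from_flow`, the pinhole model without distortion, the nearest-pixel rounding, and the `(H, W, 2)` flow layout are all assumptions made for illustration, and the flow field itself would come from the learned network, which is not shown here.

```python
import numpy as np

def project_to_depth(points_w, T_cw, K, height, width):
    """Render a sparse depth map from map points (hypothetical helper).

    points_w : (N, 3) LiDAR map points in the world frame
    T_cw     : (4, 4) world-to-camera transform, i.e. the rough initial pose
    K        : (3, 3) pinhole intrinsics (no lens distortion assumed)
    """
    # Move the points into the camera frame of the initial pose.
    pts_c = points_w @ T_cw[:3, :3].T + T_cw[:3, 3]
    z = pts_c[:, 2]

    # Discard points behind (or on) the image plane.
    front = z > 1e-6
    pts_c, z = pts_c[front], z[front]

    # Perspective projection, rounded to the nearest pixel.
    uvw = pts_c @ K.T
    u = np.round(uvw[:, 0] / z).astype(int)
    v = np.round(uvw[:, 1] / z).astype(int)

    # Keep only projections that land inside the image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]

    # Z-buffer: when several points hit one pixel, keep the nearest.
    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), z)
    depth[np.isinf(depth)] = 0.0  # 0 marks pixels without a LiDAR point
    return depth

def correspondences_from_flow(depth, flow, K):
    """Build 2D-3D pairs from a depth map and a flow field (hypothetical helper).

    flow is assumed to have shape (H, W, 2) and to store, per depth pixel,
    the (du, dv) displacement to the matching location in the event frame.
    The 3D points are expressed in the camera frame of the initial pose.
    """
    v, u = np.nonzero(depth > 0)
    z = depth[v, u]
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]

    # Back-project each valid depth pixel to a 3D point.
    pts3d = np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)

    # Shift each pixel by its flow to get the observed 2D location.
    pts2d = np.stack([u + flow[v, u, 0], v + flow[v, u, 1]], axis=1)
    return pts2d, pts3d
```

The resulting `pts2d` and `pts3d` arrays could then be passed to any PnP solver, for example OpenCV's `cv2.solvePnPRansac`. Because the 3D points in this sketch live in the camera frame of the initial pose, the solved transform would be a correction relative to that pose rather than an absolute pose in the map.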
