MICP-L: Mesh-based ICP for Robot Localization using Hardware-Accelerated Ray Casting
Alexander Mock, Sebastian Pütz, Thomas Wiemann, Joachim Hertzberg
TL;DR
MICP-L is a real-time method for localizing a robot in triangle mesh maps that combines ray casting correspondences (RCC) with traditional closest-point matching. The pipeline supports multiple range sensors and runs on CPUs or GPUs, including NVIDIA RTX hardware, using Embree/RTX-accelerated ray casting via the Rmagine library and a covariance computation reformulated as a reduction operation. Key contributions are (1) hardware-accelerated RCC for registering range sensor data to a mesh map, (2) a reduction-based covariance computation that allows pose estimates to be optimized in parallel and supports multi-sensor fusion, and (3) real-world validation on datasets from the agricultural, automotive, and aerial domains, with real-time performance. The work reports advantages over classic closest-point methods in indoor scenarios and competitive performance with mesh-based baselines, which is relevant for mesh-based navigation and autonomous operation in GPS-denied environments.
Abstract
Triangle mesh maps are a versatile 3D environment representation for robots to navigate in challenging indoor and outdoor environments exhibiting tunnels, hills and varying slopes. To make use of these mesh maps, methods are needed to accurately localize robots in such maps to perform essential tasks like path planning and navigation. We present Mesh ICP Localization (MICP-L), a novel and computationally efficient method for registering one or more range sensors to a triangle mesh map to continuously localize a robot in 6D, even in GPS-denied environments. We accelerate the computation of ray casting correspondences (RCC) between range sensors and mesh maps by supporting different parallel computing devices like multicore CPUs, GPUs and the latest NVIDIA RTX hardware. By additionally transforming the covariance computation into a reduction operation, we can optimize initial pose guesses in parallel on CPUs or GPUs, making our implementation applicable in real time on many architectures. We demonstrate the robustness of our localization approach with datasets from agricultural, aerial, and automotive domains.
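To illustrate what "transforming the covariance computation into a reduction operation" can mean in practice, the following is a minimal Python sketch, not the authors' implementation. It assumes the cross-covariance between sensor points d and their mesh correspondences m is written as C = (1/n) Σ d mᵀ − mean(d) mean(m)ᵀ, so that each chunk of correspondences only needs to contribute a count and three running sums. All function names (`partial_sums`, `merge`, `finalize`, `covariance_by_reduction`) are hypothetical and chosen for this example only.

```python
from functools import reduce


def partial_sums(pairs):
    """Accumulate (n, sum_d, sum_m, sum_d_mT) over one chunk of (d, m) 3D point pairs."""
    n = 0
    sd = [0.0] * 3
    sm = [0.0] * 3
    sdm = [[0.0] * 3 for _ in range(3)]
    for d, m in pairs:
        n += 1
        for i in range(3):
            sd[i] += d[i]
            sm[i] += m[i]
            for j in range(3):
                sdm[i][j] += d[i] * m[j]
    return n, sd, sm, sdm


def merge(a, b):
    """Associative combine step: element-wise addition of two partial results."""
    na, sda, sma, sdma = a
    nb, sdb, smb, sdmb = b
    return (
        na + nb,
        [x + y for x, y in zip(sda, sdb)],
        [x + y for x, y in zip(sma, smb)],
        [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(sdma, sdmb)],
    )


def finalize(partial):
    """Turn the merged sums into the two means and the 3x3 cross-covariance."""
    n, sd, sm, sdm = partial
    mean_d = [x / n for x in sd]
    mean_m = [x / n for x in sm]
    cov = [
        [sdm[i][j] / n - mean_d[i] * mean_m[j] for j in range(3)]
        for i in range(3)
    ]
    return mean_d, mean_m, cov


def covariance_by_reduction(pairs, chunk_size=2):
    """Split the correspondences into chunks, sum each chunk, then reduce the partials."""
    if not pairs:
        raise ValueError("at least one correspondence is required")
    chunks = [pairs[i:i + chunk_size] for i in range(0, len(pairs), chunk_size)]
    # Each chunk could be handled by a separate CPU thread or GPU block;
    # here the chunks are processed sequentially for clarity.
    return finalize(reduce(merge, map(partial_sums, chunks)))
```

Because `merge` is plain element-wise addition, it is associative, so the partial results can be combined in any grouping, which is the property a parallel reduction on a CPU or GPU relies on. In an ICP-style pipeline, the resulting means and covariance would then typically feed a closed-form alignment step such as an SVD-based solver; that step is an assumption here and is not shown.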
