CMRNext: Camera to LiDAR Matching in the Wild for Localization and Extrinsic Calibration
Daniele Cattaneo, Abhinav Valada
TL;DR
CMRNext addresses cross-modal camera-LiDAR matching for monocular localization in LiDAR maps and for extrinsic calibration by decoupling dense pixel-to-3D matching from pose estimation. It uses a RAFT-based network to predict per-pixel correspondences and uncertainties, then solves the PnP problem with RANSAC to recover the camera pose, which lets it generalize to unseen sensor setups without retraining. The method achieves state-of-the-art results on multiple public datasets and three in-house platforms, with substantial gains from iterative refinement and temporal aggregation for calibration. The code and models are open source, supporting scalable, camera-based localization in LiDAR-supported environments.
Abstract
LiDARs are widely used for mapping and localization in dynamic environments. However, their high cost limits their widespread adoption. On the other hand, monocular localization in LiDAR maps using inexpensive cameras is a cost-effective alternative for large-scale deployment. Nevertheless, most existing approaches struggle to generalize to new sensor setups and environments, requiring retraining or fine-tuning. In this paper, we present CMRNext, a novel approach for camera-LiDAR matching that is independent of sensor-specific parameters, generalizable, and can be used in the wild for monocular localization in LiDAR maps and camera-LiDAR extrinsic calibration. CMRNext exploits recent advances in deep neural networks for matching cross-modal data and standard geometric techniques for robust pose estimation. We reformulate the point-pixel matching problem as an optical flow estimation problem and solve the Perspective-n-Point problem based on the resulting correspondences to find the relative pose between the camera and the LiDAR point cloud. We extensively evaluate CMRNext on six different robotic platforms, including three publicly available datasets and three in-house robots. Our experimental evaluations demonstrate that CMRNext outperforms existing approaches on both tasks and effectively generalizes to previously unseen environments and sensor setups in a zero-shot manner. We make the code and pre-trained models publicly available at http://cmrnext.cs.uni-freiburg.de.
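To make the reformulation concrete, the following minimal sketch shows one way that a dense flow field could be turned into 2D-3D correspondences for a Perspective-n-Point solver. It is an illustration under stated assumptions, not the released implementation: the function names, the pinhole intrinsics (fx, fy, cx, cy), the nearest-pixel flow lookup, and the convention that the 3D points are already expressed in the camera frame of an initial pose guess are all hypothetical choices made for this example.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project(point_cam: Point3D, fx: float, fy: float, cx: float, cy: float) -> Pixel:
    """Pinhole projection of a 3D point (camera frame, z > 0) to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)


def correspondences_from_flow(
    points_cam: List[Point3D],
    flow: List[List[Tuple[float, float]]],  # flow[row][col] = (du, dv), assumed layout
    fx: float, fy: float, cx: float, cy: float,
) -> List[Tuple[Point3D, Pixel]]:
    """Pair each visible 3D point with the image pixel that the flow points to."""
    height, width = len(flow), len(flow[0])
    matches = []
    for p in points_cam:
        if p[2] <= 0:  # point is behind the camera
            continue
        u, v = project(p, fx, fy, cx, cy)
        col, row = int(round(u)), int(round(v))
        if not (0 <= col < width and 0 <= row < height):  # projects outside the image
            continue
        du, dv = flow[row][col]
        # The displaced pixel is the predicted match of this 3D point in the camera image.
        matches.append((p, (u + du, v + dv)))
    return matches


# Toy example: an 80x100 flow field that is zero except at pixel (col=50, row=40).
flow = [[(0.0, 0.0) for _ in range(100)] for _ in range(80)]
flow[40][50] = (3.0, -2.0)
points = [(0.0, 0.0, 2.0), (1.0, 0.0, -1.0), (10.0, 0.0, 1.0)]
matches = correspondences_from_flow(points, flow, fx=100.0, fy=100.0, cx=50.0, cy=40.0)
print(matches)  # [((0.0, 0.0, 2.0), (53.0, 38.0))]
```

The resulting 3D-point and pixel pairs are the kind of input a robust PnP solver consumes, for example OpenCV's `cv2.solvePnPRansac`. Under the convention assumed here, the recovered pose would be a correction relative to the initial guess.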
