LiM-Loc: Visual Localization with Dense and Accurate 3D Reference Maps Directly Corresponding 2D Keypoints to 3D LiDAR Point Clouds

Masahiko Tsuji; Hitoshi Niigaki; Ryuichi Tanida

TL;DR

This work tackles visual localization by constructing dense, accurate 3D reference maps and using direct 2D-3D matching to estimate the camera pose with PnP in a 3D map. LiM-Loc directly assigns 2D keypoints to 3D LiDAR points, avoiding traditional feature matching and applying Hidden Point Removal with spherical shell compression to prune occlusions, while Reference Image Reduction reduces map-generation time. The approach yields a dense reference map, improves 2D-3D inliers, and demonstrates accuracy improvements across indoor and outdoor datasets for multiple local features. The results show errors of only a few centimeters and faster map generation, highlighting practical benefits for autonomous systems and robotics.

Abstract

Visual localization is the task of estimating the 6-DOF camera pose of a query image in a 3D reference map. The map is built in advance by extracting keypoints from reference images and reconstructing them in 3D. We emphasize that the more keypoints the 3D reference map contains and the smaller the error in their 3D positions, the higher the accuracy of the camera pose estimation. However, previous image-only methods require a huge number of images, and inevitable mismatches and failures in feature matching make it difficult to reconstruct keypoints in 3D without error. As a result, the 3D reference map is sparse and inaccurate. In contrast, accurate 3D reference maps can be generated by combining images and 3D sensors. 3D LiDAR, which measures large spaces at high density, has recently become inexpensive and widely used. Accurately calibrated cameras are also widely used, so images with accurately recorded extrinsic camera parameters can be easily obtained. In this paper, we propose a method that directly assigns 3D LiDAR points to keypoints to generate dense and accurate 3D reference maps. The proposed method avoids feature matching and achieves accurate 3D reconstruction for almost all keypoints. To estimate camera pose over a wide area, we use the wide-area LiDAR point cloud, remove points that are not visible to the camera, and thereby reduce 2D-3D correspondence errors. Using indoor and outdoor datasets, we apply the proposed method to several state-of-the-art local features and confirm that it improves the accuracy of camera pose estimation.
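
The direct 2D-3D assignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple pinhole camera with known world-to-camera extrinsics `R`, `t` and intrinsics `fx`, `fy`, `cx`, `cy`, and it replaces the paper's Hidden Point Removal step with a crude "keep the nearest depth" rule. The function names and the `max_px` pixel tolerance are hypothetical and chosen only for illustration.

```python
import math

def project(point_world, R, t, fx, fy, cx, cy):
    """Project a world-frame 3D point to pixels with a pinhole model.

    Assumes X_cam = R @ X_world + t, with R as nested lists (3x3).
    Returns (u, v, depth), or None if the point is behind the camera.
    """
    xc = [sum(R[i][j] * point_world[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0.0:
        return None
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy, xc[2])

def assign_lidar_to_keypoints(keypoints, lidar_points, R, t, fx, fy, cx, cy, max_px=2.0):
    """Map each keypoint index to the index of a LiDAR point.

    A LiDAR point is a candidate if its projection lies within max_px
    pixels of the keypoint. Among candidates, the one closest to the
    camera wins; this is a simplified stand-in for occlusion removal.
    """
    projected = []
    for idx, p in enumerate(lidar_points):
        uvz = project(p, R, t, fx, fy, cx, cy)
        if uvz is not None:
            projected.append((idx, uvz))

    matches = {}
    for k, (ku, kv) in enumerate(keypoints):
        best = None  # (lidar index, depth)
        for idx, (u, v, z) in projected:
            if math.hypot(u - ku, v - kv) <= max_px and (best is None or z < best[1]):
                best = (idx, z)
        if best is not None:
            matches[k] = best[0]
    return matches

# Toy scene: camera at the origin looking down +z.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
lidar_points = [(0.0, 0.0, 5.0), (0.0, 0.0, 2.0), (1.0, 0.0, 4.0), (0.0, 0.0, -1.0)]
keypoints = [(50.0, 50.0), (75.0, 50.0), (10.0, 10.0)]
matches = assign_lidar_to_keypoints(keypoints, lidar_points, R, t, 100.0, 100.0, 50.0, 50.0)
print(matches)  # {0: 1, 1: 2}
```

In the toy scene, two LiDAR points project onto the first keypoint and the nearer one is kept, while the third keypoint has no LiDAR point nearby and stays unassigned. The resulting keypoint-to-3D pairs are the kind of 2D-3D correspondences a PnP solver would consume at query time.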
