LightLoc: Learning Outdoor LiDAR Localization at Light Speed

Wen Li, Chen Liu, Shangshu Yu, Dunqiang Liu, Yin Zhou, Siqi Shen, Chenglu Wen, Cheng Wang

TL;DR

LightLoc addresses the prohibitive training time of regression-based outdoor LiDAR localization by freezing a scene-agnostic backbone and training only scene-specific heads, augmented with sample classification guidance (SCG) and redundant sample downsampling (RSD). SCG adds a fast, roughly 5-minute sample classification step whose output guides regression learning, while RSD prunes well-learned samples during training to cut training time without sacrificing accuracy. The approach delivers state-of-the-art or competitive localization accuracy with about 50× less training time on large-scale outdoor datasets, and the classifier's confidence estimates allow integration into SLAM to eliminate error accumulation. In practice, this enables rapid adaptation to new environments, which suits autonomous driving, drone, and robotics deployments that require frequent model updates.

Abstract

Scene coordinate regression achieves impressive results in outdoor LiDAR localization but requires days of training. Since training needs to be repeated for each new scene, long training times make these methods impractical for time-sensitive applications, such as autonomous driving, drones, and robotics. We identify large coverage areas and vast data in large-scale outdoor scenes as key challenges that limit fast training. In this paper, we propose LightLoc, the first method capable of efficiently learning localization in a new scene at light speed. LightLoc introduces two novel techniques to address these challenges. First, we introduce sample classification guidance to assist regression learning, reducing ambiguity from similar samples and improving training efficiency. Second, we propose redundant sample downsampling to remove well-learned frames during training, reducing training time without compromising accuracy. Additionally, the fast training and confidence estimation capabilities of sample classification enable its integration into SLAM, effectively eliminating error accumulation. Extensive experiments on large-scale outdoor datasets demonstrate that LightLoc achieves state-of-the-art performance with a 50× reduction in training time compared with existing methods. Our code is available at https://github.com/liw95/LightLoc.
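
The idea of removing well-learned frames during training can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `downsample_redundant`, the `keep_ratio` parameter, and the rule "drop the samples with the lowest most-recent loss" are assumptions made for illustration, since the abstract does not specify the selection criterion.

```python
def downsample_redundant(losses, keep_ratio=0.75, min_keep=1):
    """Return sorted indices of the samples to keep for the next epoch.

    losses: dict mapping sample index -> most recent per-sample loss.
    Assumption (not from the paper): low-loss samples are treated as
    well learned and dropped first.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(min_keep, round(len(losses) * keep_ratio))
    # Rank hardest first, so the easiest samples fall off the end.
    ranked = sorted(losses, key=losses.get, reverse=True)
    return sorted(ranked[:n_keep])


# Toy per-sample losses for eight frames (hypothetical values).
losses = {0: 0.90, 1: 0.05, 2: 0.40, 3: 0.02,
          4: 0.70, 5: 0.30, 6: 0.01, 7: 0.60}
kept = downsample_redundant(losses, keep_ratio=0.75)
print(kept)  # frames 3 and 6 have the lowest losses and are dropped
```

In a training loop, a call like this would run between epochs so that later epochs iterate over a shrinking subset; how often to prune and how aggressively are tuning choices that this sketch leaves open.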

Paper Structure

This paper contains 23 sections, 5 equations, 11 figures, 10 tables, 1 algorithm.

Figures (11)

  • Figure 1: LiDAR localization performance vs. training time. The figure shows the mean position error and training time of several regression-based methods on the QEOxford [Dan_2020_ICRA, Li_2023_CVPR] dataset. The proposed LightLoc achieves state-of-the-art performance with significantly reduced training time.
  • Figure 2: Illustration of the training pipeline for LightLoc. (a) The backbone is trained with $N$ regression heads for $N$ scenes in parallel to produce a scene-agnostic feature backbone. (b) In new scenes, the backbone parameters are frozen, and only the scene-specific prediction heads are trained. We propose sample classification guidance (SCG) and redundant sample downsampling (RSD) to accelerate training. SCG trains an MLP head whose output, a probability distribution over samples, serves as a feature that aids scene coordinate regression (SCR) learning. RSD is incorporated into the training loop to filter out well-learned samples, enabling high-speed training. RPC and WPC denote the point clouds in the raw and world coordinate frames, respectively. SC means sample classification.
  • Figure 3: Illustration of results with error accumulation eliminated. (a) Relationship between position error and confidence estimation. (b) Visualization results of LOAM and ours. The star denotes the first frame.
  • Figure 4: Architecture of the scene-agnostic feature backbone. The parameter count is about 16M.
  • Figure 5: Illustration of the multi-scene division in nuScenes. The scene indices and the number of samples are reported.
  • ...and 6 more figures
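
The relationship between confidence and position error in Figure 3 suggests a simple way to use a classifier's confidence when correcting drift in a SLAM trajectory. The following is a hedged sketch only: using the maximum softmax probability as the confidence, the `threshold` value, and the hard switch between the odometry pose and the localization pose are all assumptions made for illustration, not the fusion rule described in the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_pose(odom_pose, loc_pose, logits, threshold=0.9):
    """Pick the localization pose only when the classifier is confident.

    odom_pose, loc_pose: (x, y) tuples (hypothetical 2D poses).
    logits: sample-classification logits for the current frame.
    Returns (pose, confidence, used_localization).
    """
    confidence = max(softmax(logits))
    if confidence >= threshold:
        return loc_pose, confidence, True   # trust absolute localization
    return odom_pose, confidence, False     # fall back to odometry


# A peaked distribution is accepted; a flat one is rejected.
pose_a, conf_a, used_a = fuse_pose((10.0, 5.0), (9.2, 5.1), [8.0, 0.5, 0.1])
pose_b, conf_b, used_b = fuse_pose((10.0, 5.0), (3.0, 7.0), [1.0, 0.9, 1.1])
print(used_a, used_b)
```

A hard threshold keeps the sketch short; a practical system would more likely weight the localization measurement by its confidence inside the SLAM back end rather than overwrite the pose outright.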