FlashMix: Fast Map-Free LiDAR Localization via Feature Mixing and Contrastive-Constrained Accelerated Training

Raktim Gautam Goswami; Naman Patel; Prashanth Krishnamurthy; Farshad Khorrami

TL;DR

This work proposes FlashMix, which uses a frozen, scene-agnostic backbone to extract local point descriptors, aggregated with an MLP mixer to predict sensor pose, and demonstrates its effectiveness for rapid and accurate LiDAR localization in real-world scenarios.

Abstract

Map-free LiDAR localization systems accurately localize within known environments by predicting sensor position and orientation directly from raw point clouds, eliminating the need for large maps and descriptors. However, their long training times hinder rapid adaptation to new environments. To address this, we propose FlashMix, which uses a frozen, scene-agnostic backbone to extract local point descriptors, aggregated with an MLP mixer to predict sensor pose. A buffer of local descriptors is used to accelerate training by orders of magnitude, combined with metric learning or contrastive loss regularization of aggregated descriptors to improve performance and convergence. We evaluate FlashMix on various LiDAR localization benchmarks, examining different regularizations and aggregators, demonstrating its effectiveness for rapid and accurate LiDAR localization in real-world scenarios. The code is available at https://github.com/raktimgg/FlashMix.
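The aggregation step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). The array sizes, the ReLU activations, the omission of layer normalization, the random stand-in for backbone outputs, and the 6-value pose head are all assumptions made for illustration. It shows local descriptors being read from a precomputed buffer, mixed across points and across channels, average pooled into one global descriptor per scan, and mapped to a pose vector.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, w2):
    # Two-layer MLP with ReLU (assumed activation), applied along the last axis.
    return np.maximum(x @ w1, 0.0) @ w2

def mixer_aggregate(desc, params):
    # desc: (N, C) local descriptors for one scan.
    # Point mixing: transpose so the MLP acts across the N points.
    x = desc + mlp(desc.T, params["p1"], params["p2"]).T
    # Channel mixing: the MLP acts across the C channels of each point.
    x = x + mlp(x, params["c1"], params["c2"])
    # Average pooling over points yields one global descriptor of size C.
    return x.mean(axis=0)

# Hypothetical sizes: N points per scan, C channels, H hidden units.
N, C, H = 8, 16, 32
params = {
    "p1": rng.normal(scale=0.1, size=(N, H)),
    "p2": rng.normal(scale=0.1, size=(H, N)),
    "c1": rng.normal(scale=0.1, size=(C, H)),
    "c2": rng.normal(scale=0.1, size=(H, C)),
}
# Hypothetical linear pose head producing 6 values per scan.
w_pose = rng.normal(scale=0.1, size=(C, 6))

# Stand-in for the descriptor buffer: in the described method these would be
# outputs of the frozen backbone, computed once and reused during training.
buffer = rng.normal(size=(4, N, C))

global_desc = np.stack([mixer_aggregate(d, params) for d in buffer])
poses = global_desc @ w_pose
print(global_desc.shape, poses.shape)
```

Because the backbone is frozen, only the small mixer and pose head would need gradient updates, and every training step can read descriptors from the buffer instead of re-running the backbone. This is the intuition behind the training speed-up the abstract describes.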

Paper Structure (23 sections, 9 equations, 5 figures, 12 tables)

This paper contains 23 sections, 9 equations, 5 figures, 12 tables.

Introduction
Related Works
Methodology
Problem Statement
Scene-agnostic Backbone
Scene-specific Regressor
Descriptor Aggregator
Pose Predictor
Training Objective
Pose Loss
Contrastive Regularization
FlashMix Training
Experiments
Dataset and Implementation Details
Results
...and 8 more sections

Figures (5)

Figure 1: Comparison of a LiDAR pose-regression-based framework (top) with our fast map-free LiDAR localization system.
Figure 2: FlashMix framework: A scene-agnostic backbone extracts local descriptors from farthest-point-sampled point clouds and stores them in a training buffer. An MLP-Mixer with global average pooling produces an aggregate descriptor from which the pose is predicted; training uses pose and contrastive losses.
Figure 3: MLP-Mixer aggregator that fuses local descriptors using point-mixing and channel-mixing MLPs, followed by average pooling.
Figure 4: Analysis of relocalization rate as a function of train time.
Figure 5: Visualization of different methods on test trajectories from the Oxford-Radar, DCC, and vReLoC datasets. The ground truth and estimated positions are shown as dark blue and red dots, respectively. The star marks the starting position.
