Towards classification-based representation learning for place recognition on LiDAR scans
Maksim Konoplia, Dmitrii Khizbullin
TL;DR
The paper investigates a classification-based formulation of LiDAR-based place recognition: spatial locations across multiple NuScenes maps are discretized into classes, and an encoder-decoder model is trained to predict the location class of each scan. A Masked Cross-Entropy loss is used to stabilize training. At evaluation time, places are retrieved by k-nearest-neighbor (KNN) search over a pre-indexed database of embeddings, which keeps retrieval efficient. Results are competitive with some baselines but lag behind contrastive-learning models, highlighting trade-offs between training stability, efficiency, and accuracy. The work also discusses data-splitting strategies, out-of-domain evaluation, and scalability considerations, underscoring both the potential and the challenges of classification-based localization for large-scale deployment.
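The retrieval step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_search`, the brute-force scan, and the Euclidean metric are all assumptions made for illustration, and a real system would more likely use a dedicated nearest-neighbor index.

```python
import math


def knn_search(query, database, k=1):
    """Return indices of the k database embeddings closest to `query`.

    Assumptions (not from the paper): embeddings are plain sequences of
    floats, distance is Euclidean, and the search is a brute-force scan.
    """
    # Pair each distance with its database index so sorting keeps track of it.
    dists = [(math.dist(query, emb), i) for i, emb in enumerate(database)]
    dists.sort()
    return [i for _, i in dists[:k]]


# Toy database of three 2-D embeddings (hypothetical values).
database = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
nearest = knn_search([0.9, 0.1], database, k=2)
```

Here `nearest` lists database entries from closest to farthest; the place associated with the top entry would be taken as the predicted location.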
Abstract
Place recognition is a crucial task in autonomous driving, allowing vehicles to determine their position from sensor data. While most existing methods rely on contrastive learning, we explore an alternative approach that frames place recognition as a multi-class classification problem. Our method assigns discrete location labels to LiDAR scans and trains an encoder-decoder model to classify each scan's position directly. We evaluate this approach on the NuScenes dataset and show that it achieves performance competitive with contrastive learning-based methods while offering advantages in training efficiency and stability.
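To make the idea of discrete location labels concrete, the sketch below maps a pose to an integer class id. The uniform square grid, the cell size, the shared grid dimensions for every map, and the name `location_class` are assumptions for illustration only; this section does not specify how the discretization is actually done.

```python
import math


def location_class(map_index, x, y, cell_size, cols, rows):
    """Map a pose (x, y) on a given map to a single integer class id.

    Assumptions (not from the paper): each map is covered by the same
    uniform grid of `cols` x `rows` square cells of side `cell_size`,
    and coordinates start at the map origin.
    """
    # Clamp to the grid so poses on the boundary still get a valid cell.
    col = min(max(math.floor(x / cell_size), 0), cols - 1)
    row = min(max(math.floor(y / cell_size), 0), rows - 1)
    # Offset by map so that classes from different maps never collide.
    return map_index * cols * rows + row * cols + col


# Hypothetical pose on the second map (index 1) with 10 m cells.
label = location_class(1, 25.0, 15.0, cell_size=10.0, cols=100, rows=100)
```

With labels of this kind, every scan has a single target class, so the model can be trained with a standard classification objective rather than with positive and negative pairs.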
