Polar Perspectives: Evaluating 2-D LiDAR Projections for Robust Place Recognition with Visual Foundation Models
Pierpaolo Serio, Giulio Pisaneschi, Andrea Dan Ryals, Vincenzo Infantino, Lorenzo Gentilini, Valentina Donzella, Lorenzo Pollini
TL;DR
This paper systematically compares 2-D LiDAR-to-image projections for place recognition using a state-of-the-art vision foundation model, isolating the effect of the projection by keeping the backbone and aggregation fixed. It introduces a modular pipeline built around DINOv3 that evaluates BEV, polar, range-image, and front-view representations with consistent channels across KITTI, NCLT, HeLiPR, and a challenging warehouse dataset. The results show that the polar projection provides the most robust and discriminative descriptors across outdoor domains, with BEV as a strong second choice; range and front views lag in outdoor scenarios, although they can be useful in dense indoor environments. The work offers practical guidance for projection selection, demonstrates how lightweight projection-head tuning interacts with the choice of representation, and suggests directions for improvement, such as projection-head co-design and ensemble approaches, to further enhance real-time LiDAR-based place recognition.
Abstract
This work presents a systematic investigation into how alternative LiDAR-to-image projections affect metric place recognition when coupled with a state-of-the-art vision foundation model. We introduce a modular retrieval pipeline that controls for backbone, aggregation, and evaluation protocol, thereby isolating the influence of the 2-D projection itself. Using consistent geometric and structural channels across multiple datasets and deployment scenarios, we identify the projection characteristics that most strongly determine discriminative power, robustness to environmental variation, and suitability for real-time autonomy. Experiments on multiple datasets, together with integration into an operational place recognition policy, validate the practical relevance of these findings and demonstrate that carefully designed projections can serve as an effective surrogate for end-to-end 3-D learning in LiDAR place recognition.
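To make the notion of a LiDAR-to-image projection concrete, the following is a minimal sketch of a polar projection, assuming a point cloud given as (x, y, z) tuples in the sensor frame. The function name `polar_projection`, the grid size, the range limit, and the single max-height channel are all illustrative assumptions and are not taken from the paper's pipeline.

```python
import math


def polar_projection(points, n_rings=4, n_sectors=8, max_range=50.0):
    """Project (x, y, z) points onto a polar grid, keeping the max height per cell.

    Hypothetical helper: grid size, range limit, and the max-height channel
    are illustrative assumptions, not values taken from the paper.
    """
    # Empty cells hold None so they stay distinguishable from a height of 0.
    grid = [[None] * n_sectors for _ in range(n_rings)]
    for x, y, z in points:
        r = math.hypot(x, y)
        if r >= max_range:
            continue  # drop returns beyond the radial extent of the grid
        # Map the azimuth from [-pi, pi] to [0, 2*pi) before binning.
        theta = math.atan2(y, x) % (2.0 * math.pi)
        # Clamp both indices to guard against floating-point edge cases.
        ring = min(int(r / max_range * n_rings), n_rings - 1)
        sector = min(int(theta / (2.0 * math.pi) * n_sectors), n_sectors - 1)
        cell = grid[ring][sector]
        if cell is None or z > cell:
            grid[ring][sector] = z
    return grid


# Toy scan: two returns in the same cell, one to the left, one out of range.
scan = [(10.0, 0.0, 1.0), (10.0, 0.0, 2.0), (0.0, 30.0, -0.5), (100.0, 0.0, 5.0)]
image = polar_projection(scan)
```

In this layout, rows index range and columns index azimuth, so a yaw rotation of the sensor becomes, up to binning, a circular shift along the column axis. A BEV projection would instead bin x and y directly on a Cartesian grid.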
