MinkUNeXt-SI: Improving point cloud-based place recognition including spherical coordinates and LiDAR intensity

Judith Vilella-Cantos, Juan José Cabrera, Luis Payá, Mónica Ballesta, David Valiente

TL;DR

This work tackles robust place recognition for autonomous navigation by enriching LiDAR-based descriptors with spherical coordinates and normalized intensity, then learning with a Minkowski U‑Net backbone (MinkUNeXt-SI). The method preprocesses input by transforming Cartesian to spherical space and applying histogram equalization to intensity, producing a compact, highly discriminative descriptor that generalizes across diverse datasets and sensors. Results show state-of-the-art or near‑state-of-the-art performance on multiple benchmarks (Oxford, USyd, KITTI, NCLT, ARVC), with strong cross-domain generalization and real-time inference (sub-25 ms per scan). The authors provide public code and datasets to ensure reproducibility and suggest future work on fusing LiDAR data from multiple sources to further improve cross-environment robustness.

Abstract

In autonomous navigation systems, solving the place recognition problem is crucial for safe operation. This is not a trivial problem, since the solution must be accurate regardless of changes in the scene, such as seasonal changes and different weather conditions, and it must generalize to other environments. This paper presents our method, MinkUNeXt-SI, which, starting from a LiDAR point cloud, preprocesses the input data to obtain, for each point, its spherical coordinates and an intensity value normalized to the range 0 to 1, and produces a robust place recognition descriptor. To that end, a deep learning approach that combines Minkowski convolutions and a U-Net architecture with skip connections is used. The results demonstrate that MinkUNeXt-SI reaches and surpasses state-of-the-art performance while also generalizing satisfactorily to other datasets. Additionally, we showcase the capture of a custom dataset and its use in evaluating our solution, which also achieves outstanding results. Both the code of our solution and the runs of our dataset are publicly available for reproducibility purposes.
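
The two preprocessing steps named above (conversion to spherical coordinates and mapping intensity into the range 0 to 1) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `cartesian_to_spherical` and `equalize_intensity`, the angle convention (azimuth in the x-y plane, polar angle measured from the z-axis, both in degrees), and the 256-bin CDF-based equalization are all assumptions made for illustration.

```python
import numpy as np


def cartesian_to_spherical(xyz):
    """Convert an (N, 3) array of x, y, z into (r, azimuth_deg, polar_deg).

    Assumed convention (hypothetical, not taken from the paper's code):
    azimuth is measured in the x-y plane, polar angle from the z-axis.
    """
    xyz = np.asarray(xyz, dtype=np.float64)
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.degrees(np.arctan2(y, x))
    # Avoid division by zero: points at the origin get a polar angle of 90.
    safe_r = np.where(r > 0, r, 1.0)
    polar = np.degrees(np.arccos(np.clip(z / safe_r, -1.0, 1.0)))
    return np.stack([r, azimuth, polar], axis=1)


def equalize_intensity(intensity, bins=256):
    """Histogram-equalize a non-empty 1-D intensity array into [0, 1].

    Generic CDF-based equalization; the bin count is an assumption.
    """
    intensity = np.asarray(intensity, dtype=np.float64)
    hist, edges = np.histogram(intensity, bins=bins)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]  # normalize so the largest value maps to 1
    # Look up each value on the empirical CDF sampled at the bins' right edges.
    return np.interp(intensity, edges[1:], cdf)


if __name__ == "__main__":
    # Synthetic stand-in for one LiDAR scan: columns are x, y, z, intensity.
    rng = np.random.default_rng(0)
    scan = rng.normal(size=(1000, 4))
    features = np.column_stack(
        [cartesian_to_spherical(scan[:, :3]), equalize_intensity(scan[:, 3])]
    )
    print(features.shape)
```

Each output row holds the four per-point values (radius, two angles, equalized intensity) that a sparse convolutional network could consume after quantization; how the authors quantize and feed these values is not covered by this sketch.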

Paper Structure

This paper contains 23 sections, 5 equations, 9 figures, 7 tables.

Figures (9)

  • Figure 1: MinkUNeXt-SI's workflow. First, the input point cloud is converted from Cartesian to spherical coordinates, and histogram equalization is applied to the intensity channel. The processed data is then fed to MinkUNeXt-SI, which produces a robust descriptor of the environment for place recognition.
  • Figure 2: A sample point in Cartesian space, $P = (2, 2, 2)$, and its spherical conversion, $P = (3.46, 45^\circ, 54.7^\circ)$.
  • Figure 3: Comparison between the projected image from the same point cloud colored with the intensity values: (a) not normalized; (b) normalized; (c) original point cloud.
  • Figure 4: Characteristics of the ARVC dataset. (a) Robotic setup: the platform is a Husky UGV with an Ouster OS1-128 LiDAR mounted on it. (b) Map of a run on the Miguel Hernández University campus, constructed from LiDAR scans.
  • Figure 5: Comparison of the same location as captured in our ARVC dataset in different seasons: (a) winter; (b) spring.
  • ...and 4 more figures
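
The values in the Figure 2 caption can be reproduced by hand. The worked check below assumes the triple is ordered as (radius $r$, azimuth $\theta$, polar angle $\phi$ measured from the $z$-axis), with angles in degrees; this ordering is inferred from the numbers in the caption and is not stated in the caption itself.

$$
\begin{aligned}
r &= \sqrt{x^2 + y^2 + z^2} = \sqrt{2^2 + 2^2 + 2^2} = \sqrt{12} \approx 3.46,\\
\theta &= \operatorname{atan2}(y, x) = \operatorname{atan2}(2, 2) = 45^\circ,\\
\phi &= \arccos\!\left(\frac{z}{r}\right) = \arccos\!\left(\frac{2}{\sqrt{12}}\right) \approx 54.7^\circ.
\end{aligned}
$$

Under the alternative convention that measures elevation from the $x$-$y$ plane, the third value would be $90^\circ - 54.7^\circ = 35.3^\circ$, so the caption's $54.7$ is consistent with a polar angle taken from the $z$-axis.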