Diffusion Based Robust LiDAR Place Recognition

Benjamin Krummenacher, Jonas Frey, Turcan Tuna, Olga Vysotska, Marco Hutter

TL;DR

The paper tackles robust global LiDAR-based place recognition on construction sites, addressing perceptual aliasing and the kidnapped robot problem by learning a multi-hypothesis pose distribution from synthetic LiDAR data generated inside an accurate mesh. It introduces a diffusion regression model with a PointNet++ backbone that outputs $N$ candidate poses from a single scan, followed by fast global registration (FGR) to select and refine the best match. Key contributions include the diffusion-based place recognition module, synthetic dataset generation from reality capture meshes, evaluation on five real-world datasets, and ablation and localizability analyses. The results demonstrate competitive accuracy and the ability to represent multi-modal pose distributions, enabling robust re-localization in complex, multi-floor construction environments and providing useful initial guesses for downstream global registration.

Abstract

Mobile robots on construction sites require accurate pose estimation to perform autonomous surveying and inspection missions. Localization on construction sites is particularly challenging due to repetitive features such as flat plastered walls and perceptual aliasing caused by apartments with similar layouts within and across floors. In this paper, we focus on the global re-positioning of a robot with respect to an accurate scanned mesh of the building using only LiDAR data. In our approach, a neural network is trained on synthetic LiDAR point clouds generated by simulating a LiDAR in an accurate, large-scale mesh of a real building. We train a diffusion model with a PointNet++ backbone, which allows us to model multiple position candidates from a single LiDAR point cloud. The resulting model can successfully predict the global position of the LiDAR in confined and complex sites despite the adverse effects of perceptual aliasing. The learned distribution over potential global positions is multi-modal. We evaluate our approach across five real-world datasets and show a place recognition accuracy of 77% within ±2 m on average, while outperforming baselines by a factor of 2 in mean error.
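
As a rough illustration of the reported metric, the sketch below computes the fraction of position estimates that fall within a distance threshold of the ground truth, together with the mean error. It assumes that "77% within ±2 m" means a Euclidean error of at most 2 m; the function name `accuracy_within` and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def accuracy_within(pred, gt, threshold_m=2.0):
    """Return (fraction of predictions within threshold_m, mean error in metres).

    Assumption: the error is the Euclidean distance between predicted and
    ground-truth positions; the paper may define it differently.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    errors = np.linalg.norm(pred - gt, axis=1)  # one error per scan
    return float(np.mean(errors <= threshold_m)), float(np.mean(errors))

# Hypothetical toy data: four scans, the third is placed on the wrong floor.
gt = np.array([[0.0, 0.0, 0.0],
               [5.0, 0.0, 0.0],
               [0.0, 5.0, 3.0],
               [10.0, 10.0, 6.0]])
offsets = np.array([[0.5, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 3.0],
                    [0.0, 0.0, 0.0]])
acc, mean_err = accuracy_within(gt + offsets, gt)
print(f"accuracy within 2 m: {acc:.2f}, mean error: {mean_err:.3f} m")
```

A single wrong-floor match (3 m off in height) counts as a failure under the threshold metric and also dominates the mean error, which is why the two numbers are worth reporting together.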

Paper Structure

This paper contains 17 sections, 4 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: The proposed diffusion based LiDAR place recognition pipeline learns to predict potential positions in a given environment.
  • Figure 2: Overview of the proposed place recognition module. During training, we sample random poses inside the mesh and ray cast the LiDAR pattern into the mesh. Features $\mathbf{c}$ are extracted from the resulting point clouds and provided as input to our diffusion model. During inference, a real LiDAR point cloud is passed into our model, which gradually denoises multiple random positions to $N$ candidate positions $\mathbf{\hat{x}}^{1..N}$. We use a diffusion model to generate multiple different candidate positions from one input point cloud. Fast global registration is used to select the best fitting candidate and refine the position estimate.
  • Figure 3: The datasets used in the evaluation are shown.
  • Figure 4: Trajectories of the different datasets are shown. The colors correspond to the error between the predicted and the ground truth positions. The grey area represents all the points sampled during training.
  • Figure 5: Distribution of candidate points. The best guess is found by performing fast global registration.
  • ...and 3 more figures
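
The inference step described in the Figure 2 caption, where several random positions are gradually denoised into $N$ candidates, can be sketched as follows. This is a minimal toy under stated assumptions, not the authors' implementation: it uses standard DDPM-style ancestral sampling with a linear noise schedule, and it replaces the learned, point-cloud-conditioned network with an analytic denoiser (`toy_denoiser`, hypothetical) that is exact when the true position is one of two equally likely, look-alike locations. Coordinates are in normalized units.

```python
import numpy as np

def toy_denoiser(x_t, abar_t, modes):
    """Stand-in for the learned network (assumption): exact noise prediction
    when the clean positions are an equal-weight mixture of `modes`."""
    # Squared distance of each sample to each scaled mode, shape (N, K)
    d2 = ((x_t[:, None, :] - np.sqrt(abar_t) * modes[None, :, :]) ** 2).sum(-1)
    logits = -d2 / (2.0 * (1.0 - abar_t))
    logits -= logits.max(axis=1, keepdims=True)  # avoid underflow to 0/0
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    x0_hat = w @ modes  # posterior mean of the clean position, shape (N, 3)
    return (x_t - np.sqrt(abar_t) * x0_hat) / np.sqrt(1.0 - abar_t)

def sample_candidates(modes, n_candidates=64, n_steps=200, seed=0):
    """DDPM-style ancestral sampling of n_candidates positions."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.05, n_steps)  # assumed linear schedule
    alphas = 1.0 - betas
    abars = np.cumprod(alphas)
    x = rng.standard_normal((n_candidates, modes.shape[1]))  # random start
    for t in range(n_steps - 1, -1, -1):
        eps_hat = toy_denoiser(x, abars[t], modes)
        x = (x - betas[t] / np.sqrt(1.0 - abars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:  # no noise is added in the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

# Two look-alike locations on different floors (hypothetical, normalized units)
modes = np.array([[0.5, -0.3, -1.0],
                  [0.5, -0.3, 1.0]])
candidates = sample_candidates(modes)
dists = np.linalg.norm(candidates[:, None, :] - modes[None, :, :], axis=-1)
counts = np.bincount(dists.argmin(axis=1), minlength=len(modes))
print("candidates per mode:", counts)
```

Because each candidate starts from independent noise, the samples split between the two plausible locations instead of averaging them, which is the multi-modal behaviour the caption attributes to the diffusion model. In the paper's pipeline, a registration step (fast global registration) then selects and refines the best-fitting candidate; that step is omitted here.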