Speak the Same Language: Global LiDAR Registration on BIM Using Pose Hough Transform

Zhijian Qiao, Haoming Huang, Chuhao Liu, Zehuan Yu, Shaojie Shen, Fumin Zhang, Huan Yin

TL;DR

The paper addresses the problem of aligning LiDAR scans to BIM models in a shared reference frame, enabling cross-modal global registration without external infrastructure. It introduces a front end that extracts walls and corners to form triangle descriptors with $O(1)$ retrieval, and a back end that applies a Pose Hough Transform on $SE(2)$ with hierarchical voting to generate multiple pose candidates. The optimal transformation is selected via an occupancy-aware verification score that compares BIM and LiDAR occupancy, improving robustness to as-built versus as-designed deviations. Real-world experiments in a large university building with two LiDAR sensors demonstrate the method's accuracy and efficiency, and the authors provide open-source code and expanded datasets to facilitate further research.

Abstract

Light detection and ranging (LiDAR) point clouds and building information modeling (BIM) represent two distinct data modalities in the fields of robot perception and construction. These modalities originate from different sources and are associated with unique reference frames. The primary goal of this study is to align these modalities within a shared reference frame using a global registration approach, effectively enabling them to "speak the same language". To achieve this, we propose a cross-modality registration method, spanning from the front end to the back end. At the front end, we extract triangle descriptors by identifying walls and intersected corners, enabling the matching of corner triplets with a complexity independent of the BIM's size. For the back-end transformation estimation, we utilize the Hough transform to map the matched triplets to the transformation space and introduce a hierarchical voting mechanism to hypothesize multiple pose candidates. The final transformation is then verified using our designed occupancy-aware scoring method. To assess the effectiveness of our approach, we conducted real-world multi-session experiments in a large-scale university building, employing two different types of LiDAR sensors. We make the collected datasets and codes publicly available to benefit the community.
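
The back-end idea of mapping matched triplets into the transformation space and voting can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a single-level vote instead of the hierarchical scheme, and the function names (`rigid_transform_2d`, `hough_vote`) and the bin sizes (`xy_res`, `yaw_res`) are assumptions chosen for illustration. Each matched triplet yields one $SE(2)$ pose, poses are binned over $(x, y, \theta)$, and the most-voted bin is averaged.

```python
import math
from collections import defaultdict


def rigid_transform_2d(src, dst):
    """Least-squares SE(2) pose (theta, tx, ty) mapping src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    dot = 0.0
    cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        dot += ax * bx + ay * by      # proportional to cos(theta)
        cross += ax * by - ay * bx    # proportional to sin(theta)
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation aligns the rotated source centroid with the target centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, tx, ty


def hough_vote(matches, xy_res=0.5, yaw_res=math.radians(5.0)):
    """Vote poses into (x, y, yaw) bins; return the mean pose of the top bin.

    xy_res and yaw_res are illustrative bin sizes (assumptions). Yaw
    wrap-around at +/-pi is not handled in this sketch.
    """
    bins = defaultdict(list)
    for src, dst in matches:
        theta, tx, ty = rigid_transform_2d(src, dst)
        key = (round(tx / xy_res), round(ty / xy_res), round(theta / yaw_res))
        bins[key].append((theta, tx, ty))
    best = max(bins.values(), key=len)
    n = len(best)
    # Average the yaw on the circle, the translation linearly.
    theta = math.atan2(sum(math.sin(p[0]) for p in best),
                       sum(math.cos(p[0]) for p in best))
    tx = sum(p[1] for p in best) / n
    ty = sum(p[2] for p in best) / n
    return theta, tx, ty, n
```

Here `matches` is a list of `(src_triplet, dst_triplet)` pairs with corresponding vertex order. Because wrong matches tend to scatter across bins while correct ones accumulate in one, the vote tolerates outliers. Keeping several top bins rather than one would give the multiple pose candidates that the verification step then scores.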

Paper Structure

This paper contains 37 sections, 8 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: The pipeline of our proposed method begins with the accumulation of LiDAR scans to build submaps using LiDAR odometry (Section \ref{sec:submap}). From these submaps, wall, corner, and ground features are extracted (Section \ref{sec:feature}) to construct triplets and generate triangle descriptors (Section \ref{sec:descriptor}). For BIM preprocessing, specialized software tools are employed to extract wall and corner features (Section \ref{sec:bim_de}). The triangle descriptors are efficiently matched using a hash algorithm, enabling one-shot registration across the entire BIM. Based on the matched triplets, transformation candidates are generated, and a pose Hough transform combined with a hierarchical voting procedure is applied to identify the most plausible transformation candidates (Section \ref{sec:hough}). Finally, an occupancy-aware confidence score is introduced to determine the optimal transformation (Section \ref{sec:verify}).
  • Figure 2: The workflow for extracting triangle descriptors from LiDAR submaps begins with a dense LiDAR submap. The process starts with plane segmentation to identify 3D wall structures. These segmented walls are projected onto the 2D ground plane, where lines are detected. Corner points are then extracted from the intersections of adjacent lines. Finally, triangle descriptors are computed based on the corner triplets within the submap.
  • Figure 3: Triangular descriptor formulation, where each vertex in the corner triplet, denoted as $A$, $B$, and $C$, corresponds to the wall corners. The angles $\alpha$, $\beta$, and $\gamma$ represent the angles between the triangle sides and the local wall segments.
  • Figure 4: The figure demonstrates the effectiveness of triangle descriptors in matching results, with and without angle inclusion. On the left, a highlighted corner triplet from the LiDAR submap is shown in orange, which has multiple matches from the BIM using descriptor-based matching. Incorporating angle information significantly enhances the discriminative power of the triangle descriptor by encoding the local geometric structure around the triangle, which markedly reduces the false positives.
  • Figure 5: The proposed hierarchical voting method consists of three steps: Broad Filtering, Neighborhood Refinement, and NMS with Ranking. Each step gradually reduces the number of transformation candidates, balancing computational efficiency and robustness.
  • ...and 5 more figures
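
The hash-based descriptor matching described in the captions of Figures 1 to 4 can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's descriptor: it keys each corner triplet only on its quantized, sorted side lengths (the "without angles" case of Figure 4), and the names `triangle_key`, `build_table`, `lookup` and the resolution `res` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def triangle_key(a, b, c, res=0.2):
    """Rotation- and translation-invariant key from sorted side lengths.

    res is an assumed quantization step in metres.
    """
    sides = sorted((math.dist(a, b), math.dist(b, c), math.dist(c, a)))
    return tuple(round(s / res) for s in sides)


def build_table(corners, res=0.2):
    """Index every corner triplet of the reference map (e.g. BIM) by its key."""
    table = defaultdict(list)
    for triplet in combinations(corners, 3):
        table[triangle_key(*triplet, res=res)].append(triplet)
    return table


def lookup(table, triplet, res=0.2):
    """Average-case constant-time retrieval of candidate reference triplets."""
    return table.get(triangle_key(*triplet, res=res), [])
```

The table is built once for the reference map, so each query costs one dictionary lookup regardless of how many triplets the map contains. A query whose side lengths fall near a bin boundary can miss its match in this sketch; probing neighbouring keys is one way to reduce that. Side lengths alone can also return several candidates for common shapes, which is the ambiguity that the angle terms in Figure 3 are reported to reduce.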