Through the Perspective of LiDAR: A Feature-Enriched and Uncertainty-Aware Annotation Pipeline for Terrestrial Point Cloud Segmentation

Fei Zhang, Rob Chancia, Josie Clapp, Amirhossein Hassanzadeh, Dimah Dera, Richard MacKenzie, Jan van Aardt

TL;DR

The paper addresses the high labeling burden in TLS semantic segmentation by introducing a semi-automated, uncertainty-aware pipeline that uses spherical projection, multi-feature enrichment, and ensemble-based uncertainty to guide annotation. It presents Mangrove3D, a challenging mangrove TLS dataset, and demonstrates data-efficient labeling with an ensemble-uncertainty feedback loop that yields competitive $mIoU$ (≈0.76) with about 12 annotated scans. The approach generalizes across Boreal Forest (ForestSemantic) and Urban (Semantic3D) datasets, showing consistent gains from feature fusion, especially with surface normals, while maintaining a scalable, LiDAR-only processing pipeline. The visualization suite (2D feature maps, 3D colorized clouds, and compact virtual spheres) facilitates rapid triage, review, and cross-site comparison, enabling practical deployment for ecological monitoring and beyond.

Abstract

Accurate semantic segmentation of terrestrial laser scanning (TLS) point clouds is limited by costly manual annotation. We propose a semi-automated, uncertainty-aware pipeline that integrates spherical projection, feature enrichment, ensemble learning, and targeted annotation to reduce labeling effort, while sustaining high accuracy. Our approach projects 3D points to a 2D spherical grid, enriches pixels with multi-source features, and trains an ensemble of segmentation networks to produce pseudo-labels and uncertainty maps, the latter guiding annotation of ambiguous regions. The 2D outputs are back-projected to 3D, yielding densely annotated point clouds supported by a three-tier visualization suite (2D feature maps, 3D colorized point clouds, and compact virtual spheres) for rapid triage and reviewer guidance. Using this pipeline, we build Mangrove3D, a semantic segmentation TLS dataset for mangrove forests. We further evaluate data efficiency and feature importance to address two key questions: (1) how much annotated data are needed and (2) which features matter most. Results show that performance saturates after ~12 annotated scans, geometric features contribute the most, and compact nine-channel stacks capture nearly all discriminative power, with the mean Intersection over Union (mIoU) plateauing at around 0.76. Finally, we confirm the generalization of our feature-enrichment strategy through cross-dataset tests on ForestSemantic and Semantic3D. Our contributions include: (i) a robust, uncertainty-aware TLS annotation pipeline with visualization tools; (ii) the Mangrove3D dataset; and (iii) empirical guidance on data efficiency and feature importance, thus enabling scalable, high-quality segmentation of TLS point clouds for ecological monitoring and beyond. The dataset and processing scripts are publicly available at https://fz-rit.github.io/through-the-lidars-eye/.
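The projection step described in the abstract can be illustrated with a minimal sketch. It assumes the scanner sits at the origin, that pixels are indexed by azimuth (columns) and elevation (rows), and that the grid size and elevation limits are chosen by the user; the function names `spherical_project` and `density_map` are hypothetical and are not taken from the authors' released scripts.

```python
import numpy as np

def spherical_project(points, height, width, elev_min, elev_max):
    """Map N x 3 points (scanner at the origin) to pixel rows/cols.

    Assumed convention (not from the paper): columns span azimuth
    [-pi, pi), rows span elevation from elev_max (top) to elev_min.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rng = np.linalg.norm(points, axis=1)
    valid = rng > 0  # a point at the origin has no direction
    azimuth = np.arctan2(y, x)
    elevation = np.arcsin(np.clip(z / np.where(valid, rng, 1.0), -1.0, 1.0))
    valid &= (elevation >= elev_min) & (elevation <= elev_max)

    col = np.floor((azimuth + np.pi) / (2 * np.pi) * width).astype(int) % width
    row = np.floor((elev_max - elevation) / (elev_max - elev_min) * height).astype(int)
    row = np.clip(row, 0, height - 1)
    return row[valid], col[valid], rng[valid], valid

def density_map(row, col, height, width):
    """Count how many points fall into each pixel of the spherical grid."""
    counts = np.bincount(row * width + col, minlength=height * width)
    return counts.reshape(height, width)

if __name__ == "__main__":
    pts = np.random.default_rng(0).normal(size=(1000, 3))
    r, c, d, _ = spherical_project(pts, 64, 128, -np.pi / 2, np.pi / 2)
    print(density_map(r, c, 64, 128).max())
```

Keeping the `row`/`col` indices alongside the points is what makes back-projection cheap: a 2D label map can be sampled at those indices to assign each 3D point a class. The density map gives a quick check on whether the chosen grid resolution matches the scanner's angular sampling.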

Paper Structure

This paper contains 55 sections, 17 equations, 27 figures, 7 tables.

Figures (27)

  • Figure 2: Class counts for the Mangrove3D dataset.
  • Figure 3: Three-stage workflow for annotating terrestrial-LiDAR scans. Stage 1: Spherical projection converts raw TLS points into two-dimensional feature maps and pseudo-RGB images. Stage 2: An iterative loop combines active learning and self-training: an ensemble segmentation model is repeatedly refined using uncertainty-guided queries and high-confidence pseudo-labels. Stage 3: The resulting 2D segmentation masks are back-projected to 3D, the labels are refined in 3D space, and the result is reprojected to 2D, yielding a fully annotated point cloud and a refined 2D segmentation mask.
  • Figure 4: Spherical projection workflow illustrated with a scan of the CBL LiDAR. (a) Original 3D point cloud visualized by intensity with plasma color scale. (b) Geometric illustration of the spherical projection. (c) Spherical projection map of raw intensity values.
  • Figure 5: Point‐density validation of the spherical projection map. When the map resolution matches the TLS angular resolution, most pixels contain exactly one point; a value of 0 indicates no return from that angle, 2 indicates a dual‐return measurement, and values above 2 occur rarely, typically due to systematic noise.
  • Figure 6: Examples of (a) a pseudo-label map and (b) epistemic uncertainty calculated from mutual information.
  • ...and 22 more figures
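
The Figure 6 caption mentions epistemic uncertainty calculated from mutual information. A common way to obtain this from an ensemble is to subtract the mean per-member entropy from the entropy of the averaged prediction. The sketch below follows that standard formulation; the array layout `(M, H, W, C)`, the `eps` constant, and the function name `ensemble_uncertainty` are assumptions for illustration, and the paper's exact computation may differ.

```python
import numpy as np

def ensemble_uncertainty(probs, eps=1e-12):
    """Pseudo-labels and mutual information from ensemble softmax outputs.

    probs: array of shape (M, H, W, C) holding per-pixel class
    probabilities from M ensemble members (assumed layout).
    """
    mean_p = probs.mean(axis=0)                # (H, W, C) averaged prediction
    pseudo_label = mean_p.argmax(axis=-1)      # (H, W) most likely class

    # Entropy of the averaged prediction (total predictive uncertainty).
    total = -(mean_p * np.log(mean_p + eps)).sum(axis=-1)
    # Mean entropy of the individual members (data-related part).
    expected = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)

    # The difference is the mutual information: high where members disagree.
    mutual_info = np.clip(total - expected, 0.0, None)
    return pseudo_label, mutual_info

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(5, 4, 4, 3))
    p = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    labels, mi = ensemble_uncertainty(p)
    print(labels.shape, float(mi.max()))
```

Pixels where the members agree get a mutual information near zero even if each member is individually unsure, whereas pixels where confident members contradict each other score high, which is why this quantity is a natural candidate for directing annotation effort.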