Table of Contents
Fetching ...

Industrial3D: A Terrestrial LiDAR Point Cloud Dataset and CrossParadigm Benchmark for Industrial Infrastructure

Chao Yin, Hongzhe Yue, Qing Han, Difeng Hu, Zhenyu Liang, Fangzhou Lin, Bing Sun, Boyu Wang, Mingkai Li, Wei Yao, Jack C. P. Cheng

Abstract

Automated semantic understanding of dense point clouds is a prerequisite for Scan-to-BIM pipelines, digital twin construction, and as-built verification--core tasks in the digital transformation of the construction industry. Yet for industrial mechanical, electrical, and plumbing (MEP) facilities, this challenge remains largely unsolved: TLS acquisitions of water treatment plants, chiller halls, and pumping stations exhibit extreme geometric ambiguity, severe occlusion, and extreme class imbalance that architectural benchmarks (e.g., S3DIS or ScanNet) cannot adequately represent. We present Industrial3D, a terrestrial LiDAR dataset comprising 612 million expertly labelled points at 6 mm resolution from 13 water treatment facilities. At 6.6x the scale of the closest comparable MEP dataset, Industrial3D provides the largest and most demanding testbed for industrial 3D scene understanding to date. We further establish the first industrial cross-paradigm benchmark, evaluating nine representative methods across fully supervised, weakly supervised, unsupervised, and foundation model settings under a unified benchmark protocol. The best supervised method achieves 55.74% mIoU, whereas zero-shot Point-SAM reaches only 15.79%--a 39.95 percentage-point gap that quantifies the unresolved domain-transfer challenge for industrial TLS data. Systematic analysis reveals that this gap originates from a dual crisis: statistical rarity (215:1 imbalance, 3.5x more severe than S3DIS) and geometric ambiguity (tail-class points share cylindrical primitives with head-class pipes) that frequency-based re-weighting alone cannot resolve. Industrial3D, along with benchmark code and pre-trained models, will be publicly available at https://github.com/pointcloudyc/Industrial3D.

Industrial3D: A Terrestrial LiDAR Point Cloud Dataset and CrossParadigm Benchmark for Industrial Infrastructure

Abstract

Automated semantic understanding of dense point clouds is a prerequisite for Scan-to-BIM pipelines, digital twin construction, and as-built verification--core tasks in the digital transformation of the construction industry. Yet for industrial mechanical, electrical, and plumbing (MEP) facilities, this challenge remains largely unsolved: TLS acquisitions of water treatment plants, chiller halls, and pumping stations exhibit extreme geometric ambiguity, severe occlusion, and extreme class imbalance that architectural benchmarks (e.g., S3DIS or ScanNet) cannot adequately represent. We present Industrial3D, a terrestrial LiDAR dataset comprising 612 million expertly labelled points at 6 mm resolution from 13 water treatment facilities. At 6.6x the scale of the closest comparable MEP dataset, Industrial3D provides the largest and most demanding testbed for industrial 3D scene understanding to date. We further establish the first industrial cross-paradigm benchmark, evaluating nine representative methods across fully supervised, weakly supervised, unsupervised, and foundation model settings under a unified benchmark protocol. The best supervised method achieves 55.74% mIoU, whereas zero-shot Point-SAM reaches only 15.79%--a 39.95 percentage-point gap that quantifies the unresolved domain-transfer challenge for industrial TLS data. Systematic analysis reveals that this gap originates from a dual crisis: statistical rarity (215:1 imbalance, 3.5x more severe than S3DIS) and geometric ambiguity (tail-class points share cylindrical primitives with head-class pipes) that frequency-based re-weighting alone cannot resolve. Industrial3D, along with benchmark code and pre-trained models, will be publicly available at https://github.com/pointcloudyc/Industrial3D.

Paper Structure

This paper contains 30 sections, 7 equations, 9 figures, 13 tables.

Figures (9)

  • Figure 1: Overview of the Industrial3D dataset from representative water treatment facilities. (a) UAV image of the facility. (b) On-site photographs from five representative rooms (Areas 1, 2, 9, 12, and 13). (c) Manually annotated point clouds for three representative scenes showing 12-class MEP labels; structural elements and non-MEP clutter are omitted for clarity.
  • Figure 2: Data acquisition equipment for Industrial3D. The Leica BLK360 terrestrial laser scanner is shown in detail (inset) and deployed in an industrial scene (WRT2), where an iPad Pro controller facilitates real-time preview and registration.
  • Figure 3: Representative instances of each semantic class in Industrial3D, rendered as synthetic point clouds from BIM models for clarity. Actual TLS scans exhibit significant occlusion, noise, and incomplete coverage. Pipe-related classes (pipe, elbow, tee, reducer) share geometric similarity that creates segmentation challenges.
  • Figure 4: Four-step annotation pipeline used to label the Industrial3D dataset (612M points): tiling into manageable segments, per-tile cluster extraction in CloudCompare, scene-wide merging, and boundary refinement with outlier removal.
  • Figure 5: Class distribution and train-test split in Industrial3D. (A) Point counts per semantic class (logarithmic scale, units in millions). Classes group into three tiers: Head (77%), Common (20%), Tail (3%). (B) Distribution across training (86.1%) and test (13.9%) sets.
  • ...and 4 more figures