ISC-Perception: A Hybrid Computer Vision Dataset for Object Detection in Novel Steel Assembly

Miftahur Rahman, Samuel Adebayo, Dorian A. Acevedo-Mejia, David Hester, Daniel McPolin, Karen Rafferty, Debra F. Laefer

TL;DR

ISC-Perception introduces a task-specific hybrid dataset for detecting Intermeshed Steel Connection (ISC) components in robotic steel erection. By merging photorealistic renders from SolidWorks Visualize, synthetic Unity scenes with custom randomizers, and a curated set of real images, the approach reduces manual labeling effort while improving generalization across diverse site conditions. Empirical evaluation shows the hybrid dataset yields the best detection performance (e.g., mAP@0.5-0.95 up to 0.664), outperforming synthetic-only and photorealistic-only baselines and achieving strong bench-test results (mAP@0.5 ≈ 0.943). The work demonstrates a scalable pathway to accelerate perception model development for construction robotics and offers the dataset to researchers and industry upon request.

Abstract

The Intermeshed Steel Connection (ISC) system, when paired with robotic manipulators, can accelerate steel-frame assembly and improve worker safety by eliminating manual assembly. Dependable perception is one of the initial stages for ISC-aware robots. However, this is hampered by the absence of a dedicated image corpus, as collecting photographs on active construction sites is logistically difficult and raises safety and privacy concerns. In response, we introduce ISC-Perception, the first hybrid dataset expressly designed for ISC component detection. It blends procedurally rendered CAD images, game-engine photorealistic scenes, and a limited, curated set of real photographs, enabling fully automatic labelling of the synthetic portion. We explicitly account for all human effort to produce the dataset, including simulation engine and scene setup, asset preparation, post-processing scripts and quality checks; our total human time to generate a 10,000-image dataset was 30.5 h versus 166.7 h for manual labelling at 60 s per image (-81.7%). A manual pilot on a representative image with five instances of ISC members took 60 s (maximum 80 s), anchoring the manual baseline. Detectors trained on ISC-Perception achieved a mean Average Precision at IoU 0.50 of 0.756, substantially surpassing models trained on synthetic-only or photorealistic-only data. On a 1,200-frame bench test, we report mAP@0.50/mAP@[0.50:0.95] of 0.943/0.823. By bridging the data gap for construction-robotics perception, ISC-Perception facilitates rapid development of custom object detectors and is freely available for research and industrial use upon request.
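
The labelling-effort figures quoted in the abstract can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the constants are taken from the abstract, while the function and variable names are assumptions introduced here, not part of the authors' tooling. It also evaluates the 80 s pilot maximum as a sensitivity check, a figure the abstract does not itself report.

```python
# Reproduce the labelling-effort arithmetic from the abstract.
# Constants come from the abstract; all names are hypothetical.
N_IMAGES = 10_000                 # dataset size
MANUAL_SECONDS_PER_IMAGE = 60     # manual baseline from the pilot
PILOT_MAX_SECONDS_PER_IMAGE = 80  # slowest pilot measurement
HYBRID_HOURS = 30.5               # total human time for the hybrid pipeline


def manual_hours(n_images: int, seconds_per_image: float) -> float:
    """Total manual labelling time in hours."""
    return n_images * seconds_per_image / 3600


def relative_change_pct(new: float, baseline: float) -> float:
    """Signed percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100


baseline_hours = manual_hours(N_IMAGES, MANUAL_SECONDS_PER_IMAGE)
change_pct = relative_change_pct(HYBRID_HOURS, baseline_hours)

# Sensitivity check (not reported in the abstract): use the pilot maximum.
upper_hours = manual_hours(N_IMAGES, PILOT_MAX_SECONDS_PER_IMAGE)
upper_change_pct = relative_change_pct(HYBRID_HOURS, upper_hours)

print(f"Manual baseline: {baseline_hours:.1f} h")
print(f"Change vs. baseline: {change_pct:.1f}%")
print(f"Manual at pilot maximum: {upper_hours:.1f} h")
```

At 60 s per image this gives the 166.7 h baseline and the -81.7% change stated above. Using the 80 s pilot maximum would only enlarge the manual estimate, so the reported saving is the more conservative of the two.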

Paper Structure

This paper contains 29 sections, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Components of ISC beam-to-beam; (a) earlier version of fabricated ISC [al-sabah_introduction_2020], (b) CAD drawing of ISC with single connection.
  • Figure 2: Source of images and workflow for creating the hybrid dataset combining different types of images.
  • Figure 3: View of Robotic Steel Assembly in Unity; (a) Outdoor Scene; (b) Indoor Scene
  • Figure 4: Dataset statistics; (a) number of instances for each class, (b) percentage of instances in each dataset, (c) number of instances per image for each class, (d) percentage of images from different sources in dataset 3.
  • Figure 5: Representative samples from ISC-Perception: (a) Unity (built-in randomizers, C2), (b) Unity (custom randomizers, C3), (c) SolidWorks Visualize photorealistic render (C1), (d) Human example from Roboflow Universe (C5), (e) Real ISC frame (C4). Roboflow Universe aggregates contributions from multiple providers (which can include stock libraries); we therefore cite Roboflow Universe as the source for (d).
  • ...and 5 more figures