ISC-Perception: A Hybrid Computer Vision Dataset for Object Detection in Novel Steel Assembly
Miftahur Rahman, Samuel Adebayo, Dorian A. Acevedo-Mejia, David Hester, Daniel McPolin, Karen Rafferty, Debra F. Laefer
TL;DR
ISC-Perception introduces a task-specific hybrid dataset for detecting Intermeshed Steel Connection (ISC) components in robotic steel erection. By merging photorealistic renders from SolidWorks Visualize, synthetic Unity scenes with custom randomizers, and a curated set of real images, the approach reduces manual labelling effort while improving generalization across diverse site conditions. Empirical evaluation shows the hybrid dataset yields the best detection performance (e.g., mAP@[0.50:0.95] up to 0.664), outperforming synthetic-only and photorealistic-only baselines and achieving strong bench-test results (mAP@0.50 ≈ 0.943). The work demonstrates a scalable pathway to accelerate perception model development for construction robotics and offers the dataset to researchers and industry upon request.
Abstract
The Intermeshed Steel Connection (ISC) system, when paired with robotic manipulators, can accelerate steel-frame assembly and improve worker safety by eliminating manual assembly. Dependable perception is one of the first capabilities that ISC-aware robots require. However, progress is hampered by the absence of a dedicated image corpus, as collecting photographs on active construction sites is logistically difficult and raises safety and privacy concerns. In response, we introduce ISC-Perception, the first hybrid dataset expressly designed for ISC component detection. It blends procedurally rendered CAD images, game-engine photorealistic scenes, and a limited, curated set of real photographs, enabling fully automatic labelling of the synthetic portion. We explicitly account for all human effort to produce the dataset, including simulation engine and scene setup, asset preparation, post-processing scripts, and quality checks; our total human time to generate a 10,000-image dataset was 30.5 h versus 166.7 h for manual labelling at 60 s per image (-81.7%). A manual pilot on a representative image with five instances of ISC members took 60 s (maximum 80 s), anchoring the manual baseline. Detectors trained on ISC-Perception achieved a mean Average Precision at IoU 0.50 of 0.756, substantially surpassing models trained on synthetic-only or photorealistic-only data. On a 1,200-frame bench test, we report mAP@0.50/mAP@[0.50:0.95] of 0.943/0.823. By bridging the data gap for construction-robotics perception, ISC-Perception facilitates rapid development of custom object detectors and is freely available for research and industrial use upon request.
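The labelling-effort comparison above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical and not part of the dataset tooling, and the only inputs are the figures quoted in the abstract (10,000 images, 60 s per image, 30.5 h of total human time for the hybrid pipeline).

```python
# Figures quoted in the abstract; all names here are hypothetical.
NUM_IMAGES = 10_000
MANUAL_SECONDS_PER_IMAGE = 60      # pilot estimate for one image
HYBRID_PIPELINE_HOURS = 30.5       # reported total human time


def manual_labelling_hours(num_images: int, seconds_per_image: float) -> float:
    """Total manual labelling time in hours for a dataset of num_images."""
    return num_images * seconds_per_image / 3600.0


def relative_change_percent(new_hours: float, baseline_hours: float) -> float:
    """Signed percentage change of new_hours relative to baseline_hours."""
    return (new_hours - baseline_hours) / baseline_hours * 100.0


manual_hours = manual_labelling_hours(NUM_IMAGES, MANUAL_SECONDS_PER_IMAGE)
change = relative_change_percent(HYBRID_PIPELINE_HOURS, manual_hours)

print(f"manual baseline: {manual_hours:.1f} h")
print(f"hybrid pipeline: {HYBRID_PIPELINE_HOURS:.1f} h")
print(f"relative change: {change:.1f}%")
```

Passing the pilot maximum of 80 s as `seconds_per_image` gives the corresponding upper-end manual baseline under the same assumptions.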
