The Collection of a Human Robot Collaboration Dataset for Cooperative Assembly in Glovebox Environments
Shivansh Sharma, Mathew Huang, Sanat Nair, Alan Wen, Christina Petlowany, Juston Moore, Selma Wanna, Mitch Pryor
TL;DR
This work introduces HAGS, the Hand and Glove Segmentation Dataset, the first public real-world dataset for hand and glove segmentation in glovebox-based human–robot collaboration (HRC). It provides 191 videos (~9 hours) with 1728 pixel-level annotations across 10 participants, captured from two camera angles during two surrogate assembly tasks, and includes deliberate out-of-distribution (OOD) conditions (green screen, ungloved hands) to probe robustness and uncertainty. Through transfer learning, uncertainty quantification, and skin-tone analyses, the study shows that pretraining on existing datasets is insufficient for real-time industrial HRC and underscores the need for task-specific, diverse data for safe operation. The dataset, baselines, and accompanying datasheet aim to advance safe, reliable real-time hand segmentation in hazardous industrial environments and to guide future data collection and model development.
Abstract
Industry 4.0 introduced AI as a transformative solution for modernizing manufacturing processes. Its successor, Industry 5.0, envisions humans as collaborators and experts guiding these AI-driven manufacturing solutions. Developing these techniques requires algorithms capable of safe, real-time identification of human positions in a scene, particularly their hands, during collaborative assembly. Although substantial efforts have curated datasets for hand segmentation, most focus on residential or commercial domains. Existing datasets targeting industrial settings predominantly rely on synthetic data, which we demonstrate does not transfer effectively to real-world operations. Moreover, these datasets lack the uncertainty estimates critical for safe collaboration. To address these gaps, we present HAGS: Hand and Glove Segmentation Dataset. The dataset provides challenging examples for building hand and glove segmentation applications in industrial human-robot collaboration scenarios, as well as out-of-distribution images, constructed via green screen augmentations, for assessing the robustness of ML classifiers. We evaluate state-of-the-art real-time segmentation models on the dataset to assess existing methods. Our dataset and baselines are publicly available.
