The Collection of a Human Robot Collaboration Dataset for Cooperative Assembly in Glovebox Environments

Shivansh Sharma, Mathew Huang, Sanat Nair, Alan Wen, Christina Petlowany, Juston Moore, Selma Wanna, Mitch Pryor

TL;DR

This work introduces HAGS, the Hand and Glove Segmentation Dataset, the first public real-world dataset for hand and glove segmentation in glovebox-based human–robot collaboration (HRC). It provides 191 videos (~9 hours) with 1728 pixel-level annotations across 10 participants, captured from two camera angles during two surrogate assembly tasks, and includes deliberate out-of-distribution (OOD) conditions (green screen, ungloved hands) to probe robustness and uncertainty. Through transfer-learning, uncertainty quantification, and skin-tone analyses, the study demonstrates that pretraining on existing datasets is insufficient for real-time industrial HRC needs and underscores the necessity of task-specific, diverse data for safe operation. The dataset and baselines, along with the accompanying datasheet, aim to advance safe, reliable real-time hand segmentation in hazardous industrial environments and guide future data collection and model development.

Abstract

Industry 4.0 introduced AI as a transformative solution for modernizing manufacturing processes. Its successor, Industry 5.0, envisions humans as collaborators and experts guiding these AI-driven manufacturing solutions. Developing these techniques necessitates algorithms capable of safe, real-time identification of human positions in a scene, particularly their hands, during collaborative assembly. Although substantial efforts have curated datasets for hand segmentation, most focus on residential or commercial domains. Existing datasets targeting industrial settings predominantly rely on synthetic data, which we demonstrate does not effectively transfer to real-world operations. Moreover, these datasets lack the uncertainty estimates critical for safe collaboration. Addressing these gaps, we present HAGS: Hand and Glove Segmentation Dataset. This dataset provides challenging examples for building hand and glove segmentation applications in industrial human-robot collaboration scenarios, as well as out-of-distribution images, constructed via green screen augmentations, for assessing ML-classifier robustness. We study state-of-the-art, real-time segmentation models to evaluate existing methods. Our dataset and baselines are publicly available.

Paper Structure

This paper contains 14 sections, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Left: (a) Example of a glovebox suite from [gb_srs_pic]. Right: (b, c) Examples of RGB images from the HAGS dataset [HAGS] with corresponding pixel-wise labels.
  • Figure 2: Example of green screen with artificially imposed texture.
  • Figure 3: Models were pretrained on the HaDr, HRC, or WH dataset and then fine-tuned on varying proportions of the HAGS training set. The Intersection-over-Union (IoU) metric was evaluated on the in-distribution portion of the HAGS test set. In the legend, the ft_ prefix indicates that the model was fine-tuned on the in-distribution dataset, HAGS. The rest of each name follows the convention {MODEL_NAME}_{PRETRAINING_DATASET}_{FINETUNING_DATASET}.
  • Figure 4: Summary of the findings from Experiment C on skin tone representation. In raw frequency counts, the HRC dataset [sajedi_uncertainty-assisted_2022] has the higher absolute representation of darker skin tones. However, when counts are normalized to proportional representation, the HAGS dataset [HAGS] complements WH [Shilkrot2019WorkingHandsAH] and HRC by placing greater weight on darker skin tones, addressing gaps in representation evident in the other datasets.
  • Figure 5: Heat map of hand and glove placements during glovebox tasks in sampled frames.
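
The Figure 3 caption relies on two conventions that a short example may make more concrete: the IoU metric and the legend naming scheme. The following is a minimal sketch, not the authors' evaluation code. It assumes binary masks given as nested lists of 0/1 values, treats two empty masks as perfect agreement (an assumption, since the caption does not say how that case is handled), and uses the hypothetical legend entry `ft_unet_HaDr_HAGS`, where the model name `unet` is purely illustrative.

```python
def binary_iou(pred, target):
    """Intersection-over-Union of two binary masks (nested lists of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        if len(pred_row) != len(target_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += p and t  # pixel labeled in both masks
            union += p or t          # pixel labeled in at least one mask
    if union == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return intersection / union


def parse_legend_name(name):
    """Split a legend entry of the form
    [ft_]{MODEL_NAME}_{PRETRAINING_DATASET}_{FINETUNING_DATASET}.

    Splitting from the right lets the model name itself contain underscores;
    it assumes the two dataset names do not (an assumption of this sketch).
    """
    fine_tuned = name.startswith("ft_")
    body = name[len("ft_"):] if fine_tuned else name
    model, pretraining, finetuning = body.rsplit("_", 2)
    return {
        "fine_tuned": fine_tuned,
        "model": model,
        "pretraining_dataset": pretraining,
        "finetuning_dataset": finetuning,
    }


# Usage: one pixel overlaps out of three labeled pixels in total.
predicted = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [1, 0]]
print(binary_iou(predicted, ground_truth))
print(parse_legend_name("ft_unet_HaDr_HAGS"))  # hypothetical legend entry
```

Per-frame scores such as these are typically averaged over a test set; how the paper aggregates them is not stated in the caption.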