Find the Assembly Mistakes: Error Segmentation for Industrial Applications

Dan Lehman, Tim J. Schoonbeek, Shao-Hsuan Hung, Jacek Kustra, Peter H. N. de With, Fons van der Sommen

TL;DR

This work tackles the problem of localizing assembly errors in industrial settings by framing it as change segmentation between an error-free anchor state and a test image. It introduces StateDiffNet, a Siamese architecture with cross-attention-based skip connections and a novel synthetic data generation pipeline that enables explicit control over meaningful state differences and imaging variability. The approach demonstrates strong generalization to unseen assembly states and error types, validated on synthetic data and real ego-centric IndustReal images, with the global cross-attention variant providing the best overall performance while revealing memorization and generalization trade-offs. The study offers practical insights into change-detection mechanisms for industry, highlights limitations of current attention-based registration, and provides publicly available code and data generation tools to support further research and deployment.

Abstract

Recognizing errors in assembly and maintenance procedures is valuable for industrial applications, since it can increase worker efficiency and prevent unplanned downtime. Although assembly state recognition is gaining attention, no existing work investigates assembly error localization. We therefore propose StateDiffNet, which localizes assembly errors by detecting the differences between a (correct) intended assembly state and a test image taken from a similar viewpoint. StateDiffNet is trained on synthetically generated image pairs, which gives full control over the type of meaningful change that should be detected. The proposed approach is the first to correctly localize assembly errors in real ego-centric video data, for both states and error types that are never presented during training. Furthermore, applying change detection to this industrial setting provides valuable insights into the mechanisms of state-of-the-art change detection algorithms and considerations for applying them. The code and data generation pipeline are publicly available at: https://timschoonbeek.github.io/error_seg.
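
To make the idea of cross-attention between the anchor and the test image more concrete, the following is a minimal NumPy sketch of a global cross-attention fusion step. It is an illustration under stated assumptions, not the StateDiffNet implementation: the function name `global_cross_attention`, the absence of learned query/key/value projections, and the choice to concatenate attended anchor features onto the test features are all hypothetical simplifications.

```python
import numpy as np


def global_cross_attention(test_feat: np.ndarray, anchor_feat: np.ndarray):
    """Fuse test features with anchor features via global cross-attention.

    Both inputs are (C, H, W) feature maps with the same channel count C.
    Hypothetical simplification: no learned projections are applied.
    Returns the fused (2C, H, W) map and the (H*W, Ha*Wa) attention weights.
    """
    c, h, w = test_feat.shape
    if anchor_feat.shape[0] != c:
        raise ValueError("anchor and test features need the same channel count")

    queries = test_feat.reshape(c, h * w).T   # (HW, C): one query per test location
    keys = anchor_feat.reshape(c, -1).T       # (HaWa, C): one key per anchor location

    # Scaled dot-product similarity between every test and anchor location.
    scores = queries @ keys.T / np.sqrt(c)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)

    # Each test location receives a weighted mix of anchor features.
    attended = weights @ keys                 # (HW, C)

    # Assumption: fuse by channel-wise concatenation of test and attended features.
    fused = np.concatenate([queries, attended], axis=1)  # (HW, 2C)
    return fused.T.reshape(2 * c, h, w), weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    test = rng.normal(size=(8, 4, 5))
    anchor = rng.normal(size=(8, 4, 5))
    fused, weights = global_cross_attention(test, anchor)
    print(fused.shape, weights.shape)
```

Because every test location attends to every anchor location, a step of this kind can in principle tolerate the viewpoint differences between the two images, at the cost of attention weights that grow quadratically with the number of spatial locations.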

Paper Structure

This paper contains 34 sections, 1 equation, and 15 figures.

Figures (15)

  • Figure 1: Overview of the proposed method on an object from a complex assembly procedure (IndustReal). The approach is evaluated on unseen assembly states and error types.
  • Figure 2: Proposed change segmentation architecture, modified from CYWS, consisting of a Siamese encoder, cross-attention-based feature-fusion blocks as skip connections in the U-Net-style decoder, and a two-layer convolutional segmentation head.
  • Figure 3: Two proposed variations to feature registration based on global cross-attention.
  • Figure 4: Overview of the training process. The ground-truth binary change mask of an image pair is created by taking the difference between the instance segmentation masks of the anchor and the sample image viewed from the camera angle of the anchor.
  • Figure 5: Performance of three models with varying orientation difference. The box plot shows the median and first and third quartiles.
  • ...and 10 more figures
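
The ground-truth construction described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, assuming each instance segmentation mask is an integer map of part IDs (0 for background) and that both maps are already rendered from the anchor's camera angle; the function name `change_mask` and the pixel-wise inequality used as the "difference" are assumptions, not the authors' pipeline.

```python
import numpy as np


def change_mask(anchor_ids: np.ndarray, sample_ids: np.ndarray) -> np.ndarray:
    """Binary change mask from two instance-ID maps of identical shape.

    Assumption: both maps are rendered from the anchor's viewpoint, so any
    pixel whose part ID differs corresponds to a meaningful state change.
    """
    if anchor_ids.shape != sample_ids.shape:
        raise ValueError("instance maps must share the same shape")
    return (anchor_ids != sample_ids).astype(np.uint8)


if __name__ == "__main__":
    # Toy 4x4 scene: part 1 in the top-left corner, part 2 in the bottom-right.
    anchor = np.zeros((4, 4), dtype=np.int32)
    anchor[0:2, 0:2] = 1
    anchor[2:4, 2:4] = 2

    # Sample state with part 2 missing (a hypothetical assembly error).
    sample = anchor.copy()
    sample[sample == 2] = 0

    print(change_mask(anchor, sample))
```

In this toy case only the pixels of the missing part are marked as changed. Rendering both states from the same camera is what keeps viewpoint differences out of the label, so the mask reflects only the state difference.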