Preliminary Investigation of SSL for Complex Work Activity Recognition in Industrial Domain via MoIL

Qingxin Xia, Takuya Maekawa, Jaime Morales, Takahiro Hara, Hirotomo Oshima, Masamitsu Fukuda, Yasuo Namioka

TL;DR

MoIL addresses the challenge of recognizing complex industrial work activities from wearable sensor data when labeled data are scarce. It introduces a motif-centric self-supervised learning framework that automatically discovers characteristic action motifs and trains an encoder to reconstruct motif similarity series, with a downstream classifier using frozen features. The approach integrates motif candidate generation, similarity-based motif selection, and a CNN-BiLSTM encoder, demonstrating state-of-the-art performance on OpenPack, Logi, TestBoard, and Skoda under limited labels. Results show MoIL outperforms representative SSL baselines in both worker-dependent and worker-independent settings, indicating strong cross-worker transfer and practical potential for industrial deployment. The work advances SSL for heterogeneous, variable industrial activities by focusing on robust motif representations learned from unlabeled data.
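
The "motif similarity series" mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the z-normalized Euclidean distance, the mapping 1 / (1 + distance), and the function names `z_normalize` and `motif_similarity_series` are assumptions chosen for illustration, since this summary does not give the paper's exact definitions. The idea shown is only that sliding a motif over an unlabeled signal yields a per-offset similarity curve, which could then serve as a label-free training target for an encoder.

```python
import math


def z_normalize(window):
    """Scale a window to zero mean and unit variance (flat windows map to zeros)."""
    mean = sum(window) / len(window)
    var = sum((x - mean) ** 2 for x in window) / len(window)
    std = math.sqrt(var)
    if std < 1e-8:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def motif_similarity_series(signal, motif):
    """Slide the motif over the signal; return one similarity in (0, 1] per offset.

    Assumption: similarity = 1 / (1 + z-normalized Euclidean distance).
    """
    m = len(motif)
    motif_z = z_normalize(motif)
    series = []
    for start in range(len(signal) - m + 1):
        window_z = z_normalize(signal[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window_z, motif_z)))
        series.append(1.0 / (1.0 + dist))
    return series


# Toy single-axis signal containing the motif at offset 2 and a
# larger-amplitude repetition of the same shape at offset 7.
motif = [0.0, 1.0, 0.0, -1.0]
signal = [0.0, 0.0, 0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 2.0, 0.0, -2.0, 0.0]

similarity = motif_similarity_series(signal, motif)
# Index of the first offset with the highest similarity.
peak = max(range(len(similarity)), key=similarity.__getitem__)
```

Because each window is z-normalized, the amplitude-scaled repetition at offset 7 scores essentially the same as the exact match at offset 2. That kind of invariance is one plausible reason to prefer a shape-based similarity when the same action is performed with varying intensity.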

Abstract

In this study, we investigate a new self-supervised learning (SSL) approach for complex work activity recognition using wearable sensors. Because collecting labeled sensor data is costly, SSL methods for human activity recognition (HAR) that use unlabeled data for pretraining have attracted attention. However, applying prior SSL methods to complex work activities such as packaging work is challenging because the observed data vary considerably with the situation, for example, the number and size of the items to pack. We focus on sensor data corresponding to characteristic and necessary actions (sensor data motifs) in a specific activity, such as the action of stretching packing tape in a box-assembly activity, and try to train a neural network in a self-supervised manner so that it identifies occurrences of these characteristic actions, i.e., Motif Identification Learning (MoIL). The feature extractor of the network is then used in the downstream task, i.e., work activity recognition, enabling precise recognition of activities that contain characteristic actions with limited labeled training data. MoIL was evaluated on real-world work activity data and achieved state-of-the-art performance under limited training labels.

Paper Structure

This paper contains 22 sections, 7 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Example occurrences of a motif corresponding to the pick-up box action in two periods.
  • Figure 2: Overview of MoIL.
  • Figure 3: F1-measure (%) of the SSL methods for each dataset.
  • Figure 4: Average F1-measure (%) of leave-one-worker-out for OpenPack dataset (worker-independent models).