Detecting Clues for Skill Levels and Machine Operation Difficulty from Egocentric Vision

Chen Long-fei, Yuichi Nakamura, Kazuaki Kondo

TL;DR

This work tackles how to infer operator skill level and machine-operation difficulty from egocentric vision during a sewing task. It introduces automatic analysis centered on the relationships among attention (gaze), hand, and hotspot using a head-mounted RGB-D camera, defining operation units and extracting features such as durations, distances $d_{AO}$, $d_{HO}$, and $d_{AH}$, speeds, correlations, and early-shift metrics. The study finds that pure-gazing duration declines with skill, while hand-approaching duration and attention movement frequency correlate with operation difficulty; early-shift patterns become more pronounced with familiarity, and the hand–attention spatial relationships remain consistent across skill levels. These results point to practical opportunities for task modeling, guidance design, and automated skill assessment, though the authors acknowledge the need for larger datasets and further validation to generalize the approach.

Abstract

In machine operation tasks, the experiences of operators at different skill levels, especially novices, can provide valuable insight into how they perceive the operational environment and form the knowledge needed to deal with various operation situations. In this study, we describe operator behavior using the relations among the operator's head, hand, and operation location (hotspot) during the operation. A total of 40 experiences of a sewing machine operation task performed by amateur operators were recorded with a head-mounted RGB-D camera. We examined important features of operational behavior across operators of different skill levels and confirmed their correlation with the difficulty of the operation steps. The results show that pure-gazing behavior is significantly reduced as the operator's skill improves. Moreover, the hand-approaching duration and the frequency of attention movement before operation are strongly correlated with operational difficulty in such machine operating environments.

Paper Structure

This paper contains 10 sections, 3 equations, 3 figures.

Figures (3)

  • Figure 1: [L] The accumulated touches of 40 sewing experiences (green) are distributed around the center of the FPV camera view (red) with a slight bias. [R] The 2D distances among the hand (red), the view center (blue), and the hotspot (green) are automatically extracted to describe operation behaviors, which indicate operation difficulty and user skill.
  • Figure 2: Different patterns of gaze behavior. The left sub-figures show the distances of the estimated attention and the hand with respect to the hotspot (taken as the coordinate origin). The black dotted line shows the lag by which attention precedes the hand in approaching the hotspot. $d_{HO}$ is set to 0 when the hand touches the hotspot. The right sub-figures show the sign of the distance change. In search (a), the sign of the change in the attention-hotspot distance ($d_{AO}$) flips frequently, whereas in gaze shift (b), $d_{AO}$ simply decreases. Sub-figures (c) and (d) show early-shift and non-early-shift behaviors, distinguished by whether $d_{AO}$ increases continuously in the late period of the operation.
  • Figure 3: (a) Comparison of features between earlier and later experiences (averaged over 20 pairs). The average of all earlier experiences is set as the baseline (0%), and the relative ratios of later experiences are shown for comparison. (b) The correlation of features with user-rated operation difficulty, where G, H, and O denote the pure-gazing, hand-approaching, and operating periods, respectively.
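
The distance features and the sign-of-change cues described in the Figure 1 and Figure 2 captions can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the noise tolerance `eps`, the flip threshold `max_flips`, and the `late_fraction` window are all hypothetical choices made for illustration, and the hotspot is assumed to be a fixed 2D image point. The special case of setting $d_{HO}$ to 0 on touch is not modelled.

```python
import math


def distance(p, q):
    """Euclidean distance between two 2D image points (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def distance_series(attention, hand, hotspot):
    """Per-frame d_AO, d_HO, d_AH from 2D tracks (hotspot assumed fixed)."""
    d_ao = [distance(a, hotspot) for a in attention]
    d_ho = [distance(h, hotspot) for h in hand]
    d_ah = [distance(a, h) for a, h in zip(attention, hand)]
    return d_ao, d_ho, d_ah


def change_signs(series, eps=1.0):
    """Sign (+1, -1, 0) of each frame-to-frame change.

    Changes with magnitude <= eps are treated as noise (sign 0).
    The tolerance eps is an assumption, not a value from the paper.
    """
    signs = []
    for prev, cur in zip(series, series[1:]):
        delta = cur - prev
        signs.append(0 if abs(delta) <= eps else (1 if delta > 0 else -1))
    return signs


def count_sign_flips(signs):
    """Number of reversals between consecutive non-zero signs."""
    nonzero = [s for s in signs if s != 0]
    return sum(1 for a, b in zip(nonzero, nonzero[1:]) if a != b)


def looks_like_search(d_ao, max_flips=2, eps=1.0):
    """Hypothetical rule: frequent reversals of d_AO suggest search,
    while a steadily decreasing d_AO suggests a plain gaze shift."""
    return count_sign_flips(change_signs(d_ao, eps)) > max_flips


def is_early_shift(d_ao, late_fraction=0.3, eps=1.0):
    """Hypothetical rule: d_AO increases at every step of the late window."""
    late_len = max(2, round(len(d_ao) * late_fraction))
    late = change_signs(d_ao[-late_len:], eps)
    return len(late) > 0 and all(s > 0 for s in late)


if __name__ == "__main__":
    gaze_shift = [100, 80, 60, 40, 20, 5]       # d_AO simply decreases
    search = [100, 60, 90, 50, 85, 40, 10]      # d_AO oscillates
    print(looks_like_search(gaze_shift), looks_like_search(search))
```

The sign sequence is thresholded by `eps` so that small tracking jitter does not register as a reversal. Any real use would need to tune that tolerance and the other thresholds against annotated operation units.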