Facial Movement Dynamics Reveal Workload During Complex Multitasking

Carter Sale, Melissa N. Stolar, Gaurav Patil, Michael J. Gostelow, Julia Wallier, Margaret C. Macpherson, Jan-Louis Kruger, Mark Dras, Simon G. Hosking, Rachel W. Kallen, Michael J. Richardson

Abstract

Real-time cognitive workload monitoring is crucial in safety-critical environments, yet established measures are intrusive, expensive, or lack temporal resolution. We tested whether facial movement dynamics from a standard webcam could provide a low-cost alternative. Seventy-two participants completed a multitasking simulation (OpenMATB) under varied load while facial keypoints were tracked via OpenPose. Linear kinematics (velocity, acceleration, displacement) and recurrence quantification features were extracted. Increasing load altered dynamics across timescales: movement magnitudes rose, temporal organisation fragmented then reorganised into complex patterns, and eye-head coordination weakened. Random forest classifiers trained on pose kinematics outperformed task performance metrics (85% vs. 55% accuracy) but generalised poorly across participants (43% vs. 33% chance). Participant-specific models reached 50% accuracy with minimal calibration (2 minutes per condition), improving continuously to 73% without plateau. Facial movement dynamics sensitively track workload with brief calibration, enabling adaptive interfaces using commodity cameras, though individual differences limit cross-participant generalisation.
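As a rough illustration of the linear kinematics mentioned in the abstract, the sketch below derives displacement, velocity, and acceleration summaries from one tracked keypoint. It is not the authors' pipeline: the function name `kinematic_features`, the 30 fps default, the `(T, 2)` pixel-coordinate input, and the use of finite differences are all assumptions made for the example.

```python
import numpy as np


def kinematic_features(xy, fps=30.0):
    """Summarise one keypoint trajectory of shape (T, 2), in pixels.

    Hypothetical helper: the frame rate and feature set are assumptions,
    not values taken from the paper.
    """
    xy = np.asarray(xy, dtype=float)
    dt = 1.0 / fps

    # Frame-to-frame displacement magnitude (pixels).
    step = np.linalg.norm(np.diff(xy, axis=0), axis=1)

    # Velocity (px/s) and acceleration (px/s^2) by finite differences.
    vel = np.gradient(xy, dt, axis=0)
    acc = np.gradient(vel, dt, axis=0)
    speed = np.linalg.norm(vel, axis=1)
    acc_mag = np.linalg.norm(acc, axis=1)

    return {
        "path_length": float(step.sum()),
        "velocity_mean": float(speed.mean()),
        "velocity_rms": float(np.sqrt(np.mean(speed ** 2))),
        "acceleration_rms": float(np.sqrt(np.mean(acc_mag ** 2))),
    }
```

Per-window summaries like these could then be stacked into a feature matrix for a classifier. The windowing scheme and classifier settings are not specified in this section, so they are left out of the sketch.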

Paper Structure (38 sections, 7 figures, 1 table)

This paper contains 38 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: Overview of the OpenMATB experimental setup. (A) Screenshot of the OpenMATB interface, displaying the four concurrent tasks: System Monitoring, Tracking, Communications, and Resource Management. (B) A participant performing the task using a joystick and monitor setup, with simultaneous recording from a webcam.
  • Figure 2: Top: Mean accuracy (%) and reaction time (ms) across OpenMATB subtasks and load levels for baseline (top row) and experimental (bottom row) blocks. Subtasks include Tracking, Resource Management (ResMan), System Monitoring (SysMon), and Communications (Comms), with average reaction time (Avg RT) shown in the rightmost column. Bars represent means; error bars indicate $\pm1$ SEM. Load conditions are colour-coded: Low (blue), Moderate (yellow), and High (red). Bottom: Pairwise Pearson correlations between subtask performance metrics across load conditions. Red asterisks indicate significant correlations ($p < .05$). Warm colours represent stronger positive correlations.
  • Figure 3: Mean values across load conditions (Low, Moderate, High) for four key facial movement features. Top left: Pupil velocity (RMS) increased progressively, particularly from moderate to high load, reflecting increased vertical visual scanning. Top right: Mouth aperture (mean) increased steadily across both load transitions, consistent with greater communication task demands. Bottom left: Blink aperture (mean) decreased progressively, indicating narrowed eye opening under load. Bottom right: Head rotation velocity (mean) showed a threshold effect, increasing only at high load. Error bars represent $\pm1$ SEM.
  • Figure 4: Head and mouth dynamics show opposite trajectories in recurrence (%REC) and determinism (%DET) across load levels. From low to moderate load, head motion became less repetitive and less predictable, reflecting fragmentation of movement patterns. At high load, head motion determinism rebounded, indicating reorganisation into more structured dynamics. In contrast, mouth aperture became increasingly repetitive and structured with rising load. Bars represent means $\pm$ SEM.
  • Figure 5: Cross-recurrence between head position and gaze decreases with cognitive load. Left panel: Mean percentage of recurrent points (%REC) in cross-recurrence quantification analysis (CRQA) between horizontal head movement (head translation X) and horizontal pupil displacement across low, moderate, and high workload conditions. Error bars represent standard error of the mean. Right panels: Representative cross-recurrence plots from a single participant showing the same time window under low (middle) and high (right) cognitive load conditions. Each red point indicates a moment when head and gaze movements exhibited similar dynamical patterns (recurrent states). The denser recurrence structure in the low condition reflects more consistent coordination between head and gaze movements, while the sparser structure in the high condition indicates reduced coupling as task demands increase. Axes show time-delayed embedded dimensions of the respective time series.
  • ...and 2 more figures
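
The %REC measure described in the Figure 5 caption can be sketched as follows: both series are time-delay embedded, and %REC is the percentage of point pairs that fall within a fixed radius of each other. This is a minimal sketch, not the authors' analysis code. The function names and the embedding dimension, delay, and radius are illustrative assumptions, since the paper's parameter choices are not given in this section.

```python
import numpy as np


def delay_embed(x, dim=3, delay=5):
    """Time-delay embed a 1-D series into rows of `dim` lagged samples."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this dim/delay")
    return np.column_stack([x[i * delay: i * delay + n] for i in range(dim)])


def cross_recurrence_rate(a, b, dim=3, delay=5, radius=0.5):
    """Percentage of recurrent points (%REC) between two series.

    dim, delay and radius are placeholder values (assumptions).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.std() == 0 or b.std() == 0:
        raise ValueError("constant series cannot be z-scored")

    # z-score so that one radius is meaningful for both signals.
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()

    ea = delay_embed(a, dim, delay)
    eb = delay_embed(b, dim, delay)

    # Pairwise Euclidean distances between all embedded states.
    dists = np.linalg.norm(ea[:, None, :] - eb[None, :, :], axis=2)

    # A pair is recurrent when the two states lie within `radius`.
    return 100.0 * float(np.mean(dists <= radius))
```

For example, `cross_recurrence_rate(head_x, pupil_x)` could be computed per load condition and compared, where `head_x` and `pupil_x` are hypothetical names for the horizontal head-translation and pupil-displacement series.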