Deep-seeded Clustering for Emotion Recognition from Wearable Physiological Sensors

Marta A. Conceição, Antoine Dubois, Sonja Haustein, Bruno Miranda, Carlos Lima Azevedo

TL;DR

This work tackles emotion recognition from wearable physiological signals under limited labeling by introducing a deep-seeded clustering framework that jointly learns latent representations and cluster assignments. By combining a sequence-to-sequence autoencoder with seeded c-means clustering, where seeds come from contextual or self-reported labels, the approach yields competitive within-subject accuracies across three datasets (WESAD, Stress-Predict, CEAP360-VR). Key findings show 80.7% accuracy on WESAD, 64.2% on Stress-Predict, and 61.0% on CEAP360-VR, with sensitivity analyses indicating potential gains from hyperparameter tuning. The method enables emotion-state inference in naturalistic settings with minimal supervision, suggesting practical applicability for longitudinal and real-world deployments, although the clustering assumptions and further optimization remain to be addressed.

Abstract

According to the circumplex model of affect, an emotional response can be characterized by a level of pleasure (valence) and intensity (arousal). Because this response is reflected in autonomic nervous system (ANS) activity, modern wearable wristbands can record its peripheral end-points non-invasively during everyday life. While emotion recognition from physiological signals is usually achieved using supervised machine learning algorithms that require ground truth labels for training, collecting these labels is cumbersome and particularly unfeasible in naturalistic settings, and extracting meaningful insights from these signals requires domain knowledge and might be prone to bias. Here, we propose and test a deep-seeded clustering algorithm that automatically extracts and classifies features from those physiological signals with minimal supervision - combining an autoencoder (AE) for unsupervised feature representation and c-means clustering for fine-grained classification. We also show that the model performs well across three datasets frequently used in affective computing studies (accuracies of 80.7% on WESAD, 64.2% on Stress-Predict and 61.0% on CEAP360-VR).
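
The clustering half of this idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic fuzzy c-means whose centroids are initialised from a handful of labelled "seed" samples, and it replaces the autoencoder's latent features with synthetic Gaussian data. The function name `seeded_c_means`, its parameters (`m`, `n_iter`, `eps`), and the choice to keep seeds at a hard membership are all illustrative assumptions.

```python
import numpy as np


def seeded_c_means(X, seed_idx, seed_labels, n_clusters, m=2.0, n_iter=50, eps=1e-9):
    """Fuzzy c-means with centroids initialised from labelled seeds (illustrative)."""
    X = np.asarray(X, dtype=float)
    seed_idx = np.asarray(seed_idx)
    seed_labels = np.asarray(seed_labels)

    # Initial centroids: mean of the seed points of each class
    # (assumes every class has at least one seed).
    centroids = np.stack(
        [X[seed_idx[seed_labels == k]].mean(axis=0) for k in range(n_clusters)]
    )

    for _ in range(n_iter):
        # Distance of every sample to every centroid, shape (n_samples, n_clusters).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2) + eps
        # Standard fuzzy membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Assumption: seeds keep a hard membership in their labelled cluster.
        U[seed_idx] = np.eye(n_clusters)[seed_labels]
        # Centroid update, weighting each sample by u^m.
        W = U ** m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]

    return U, centroids


# Synthetic stand-in for latent features: two well-separated groups in 4 dimensions.
rng = np.random.default_rng(0)
latent = np.vstack([rng.normal(0.0, 0.3, (50, 4)), rng.normal(3.0, 0.3, (50, 4))])
true = np.repeat([0, 1], 50)

# Only four samples (two per class) carry labels and act as seeds.
seed_idx = np.array([0, 1, 50, 51])
U, centroids = seeded_c_means(latent, seed_idx, true[seed_idx], n_clusters=2)

pred = U.argmax(axis=1)
accuracy = (pred == true).mean()
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

In a full pipeline of the kind the abstract describes, `latent` would instead come from the encoder of a trained autoencoder, and the seed labels from contextual or self-reported annotations. How the two components are trained jointly is not specified in this summary, so it is left out of the sketch.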

Paper Structure (19 sections, 16 equations, 15 figures, 5 tables)

Figures (15)

  • Figure 1: Diagram illustrating the proposed model architecture.
  • Figure 2: Sample images for each dataset: WESAD, Stress-Predict, and CEAP360-VR.
  • Figure 3: Confusion matrix for WESAD using non-sequential 10-fold CV (within-subject), averaged across all subjects, considering contextual "stimuli" labels for seeding.
  • Figure 4: The non-sequential 10-fold CV accuracy for individual subjects of the WESAD dataset in the test (blue) and train (orange) sets.
  • Figure 5: The non-sequential 10-fold CV silhouette scores for individual subjects of the WESAD dataset.
  • ...and 10 more figures