Modelling Emotions is an Elusive Pursuit in Affective Computing

Anders Rolighed Larsen; Sneha Das; Line Clemmensen

Abstract

Affective computing, which combines sensor technology, machine learning, and psychology, has been studied for over three decades and is employed in AI-powered technologies to enhance emotional awareness in AI systems and to detect symptoms of mental health disorders such as anxiety and depression. However, the uncertainty in such systems remains high, and the application areas are limited by categorical definitions of emotions and emotional concepts. This paper argues that categorical emotion labels obscure emotional nuance in affective computing, and that continuous dimensional definitions are therefore needed to advance the field, increase application usefulness, and lower uncertainties.

Paper Structure (20 sections, 12 figures, 4 tables)

Introduction
Theoretical Foundations and Conceptual Incongruence in Emotion Modeling
Contrasting Emotional Frameworks
Representation Constraints in MSA
Conceptual Vagueness and Terminological Diffusion
Ambiguity in Emotion Datasets
Annotation Practices and Distributional Alternatives
Pragmatic Merits of Categorical Annotation
Position and Outlook
Empirical Breakdown of Categorical Emotion Assumptions
Dimensional Inconsistencies within Categorical Labels
Modality-Specific Disagreement
Temporal Flattening of Emotional Transitions
Evaluation Under Ambiguity
Impact of Ambiguity on Model Performance
...and 5 more sections

Figures (12)

Figure 1: Pairwise emotion distances in VAD space: theoretical ground truth vs. IEMOCAP annotator averages. Most points fall below the diagonal, indicating that annotators perceived emotion categories as more similar than dimensional theory predicts.
Figure 2: Cross-modal prediction alignment matrices for each modality pair, with overall agreement percentages indicated in parentheses. Left: Text vs. Audio predictions. Center: Facial vs. Audio predictions. Right: Facial vs. Text predictions. Each matrix is normalized by the number of predictions per class made by the y-axis predictor, to account for imbalanced emotion distributions across modalities. Class support differs between modality comparisons because of faulty data entries in the facial modality.
Figure 3: Framewise emotion probabilities (colored lines) and corresponding entropy (black line) for a representative utterance. Red x's mark emotion transitions. Note the alignment of entropy spikes with transitions, indicating periods of increased affective ambiguity.
Figure 4: Weighted F1 scores for text, audio, and facial emotion recognition models across increasing levels of agreement-based filtering. Filtering is applied based on either categorical emotion annotation (CEA, solid lines) or VAD score coherence (VAD, dashed lines). Values and partitioning parameters can be found in Appendix Figures \ref{tab:cea_f1_scores} and \ref{tab:vad_f1_scores}.
Figure 5: A case in which the system predictions agree across modalities but diverge from the ground truth. All models predicted happy, while annotators labeled the utterance as angry or frustrated. Examples are taken directly from the IEMOCAP dataset (Busso2008) and used under its licensing agreement.
...and 7 more figures
