Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation

Wen Wu; Bo Li; Chao Zhang; Chung-Cheng Chiu; Qiujia Li; Junwen Bai; Tara N. Sainath; Philip C. Woodland

TL;DR

Subjective emotion labeling leaves some utterances with no majority-agreed label (NMA). The authors study three strategies: (i) adding NMA as an extra class (MLE+), which lowers accuracy on the majority-agreed (MA) classes; (ii) treating NMA utterances as out-of-domain (OOD) and detecting them through the high uncertainty produced by evidential deep learning (EDL); (iii) estimating emotion as a distribution over classes with an extended EDL (EDL*) that uses all annotator labels. EDL-based OOD detection retains classification accuracy while improving calibration and OOD metrics, and EDL* achieves better negative log-likelihood on the observed annotations and provides meaningful uncertainty for emotion distributions. Evaluated on IEMOCAP and CREMA-D, the methods give a richer representation of emotional content and annotator disagreement, supporting robust emotion recognition under ambiguity and more inclusive modeling of human opinions.

Abstract

The subjective perception of emotion leads to inconsistent labels from human annotators. Typically, utterances lacking majority-agreed labels are excluded when training an emotion classifier, which causes problems when ambiguous emotional expressions are encountered during testing. This paper investigates three methods to handle ambiguous emotion. First, we show that incorporating utterances without majority-agreed labels as an additional class in the classifier reduces the classification performance of the other emotion classes. Then, we propose detecting utterances with ambiguous emotions as out-of-domain samples by quantifying the uncertainty in emotion classification using evidential deep learning. This approach retains the classification accuracy while effectively detecting ambiguous emotional expressions. Furthermore, to obtain fine-grained distinctions among ambiguous emotions, we propose representing emotion as a distribution instead of a single class label. The task is thus re-framed from classification to distribution estimation, where every individual annotation is taken into account, not just the majority opinion. The evidential uncertainty measure is extended to quantify the uncertainty in emotion distribution estimation. Experimental results on the IEMOCAP and CREMA-D datasets demonstrate the superior capability of the proposed method in terms of majority class prediction, emotion distribution estimation, and uncertainty estimation.
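To make the idea of detecting ambiguous utterances through evidential uncertainty more concrete, the following is a minimal sketch of the standard evidential deep learning formulation, in which non-negative evidence for each class parameterises a Dirichlet distribution and the uncertainty is the number of classes divided by the Dirichlet strength. It is not taken from the paper: the softplus evidence function, the function names, the example logits, and the threshold of 0.5 are all assumptions made for illustration.

```python
import math


def softplus(z):
    """Numerically stable softplus, used here to keep evidence non-negative."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def dirichlet_from_logits(logits):
    """Map raw outputs to Dirichlet parameters: alpha_k = evidence_k + 1.

    Assumption: softplus is the evidence function; ReLU or exp are other
    common choices in the EDL literature.
    """
    return [softplus(z) + 1.0 for z in logits]


def expected_probs(alpha):
    """Mean of the Dirichlet: alpha_k / S, where S is the sum of alpha."""
    strength = sum(alpha)
    return [a / strength for a in alpha]


def uncertainty(alpha):
    """Evidential uncertainty u = K / S; equals 1 when there is no evidence."""
    return len(alpha) / sum(alpha)


def flag_ambiguous(alpha, threshold=0.5):
    """Treat an utterance as out-of-domain when u exceeds a threshold.

    The threshold value is hypothetical and would be tuned on held-out data.
    """
    return uncertainty(alpha) > threshold


# Hypothetical outputs for a four-class emotion classifier.
confident = dirichlet_from_logits([8.0, -4.0, -4.0, -4.0])
ambiguous = dirichlet_from_logits([-4.0, -4.0, -4.0, -4.0])

print(expected_probs(confident), uncertainty(confident))
print(expected_probs(ambiguous), uncertainty(ambiguous))
```

In this sketch, strong evidence for one class gives a large Dirichlet strength and low uncertainty, whereas little evidence for any class leaves the uncertainty close to 1, so the second input would be flagged as ambiguous under the assumed threshold.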

Paper Structure (28 sections, 14 equations, 22 figures, 6 tables)

Introduction
Related work
Detecting NMA as OOD by quantifying emotion classification uncertainty
Limitation of modelling class probabilities with the softmax activation function
Quantify uncertainty in emotion classification by evidential deep learning
Training
Emotion distribution estimation
Evaluation metrics
Experimental setup
Baselines
Datasets
Model structure
Results
Including NMA as an additional category degrades the performance
Detecting NMA as OOD
...and 13 more sections

Figures (22)

Figure 1: The bar chart shows the number of labels assigned by annotators to the emotion classes "angry" (Ang), "frustrated" (Fru), and "neutral" (Neu) in an example. In utterance (a), eight annotators interpret the emotion as angry while one interprets it as frustrated.
Figure 2: Illustration of the model structure.
Figure 3: The change of accuracy with respect to the uncertainty threshold for EDL-based methods on IEMOCAP and CREMA-D.
Figure 4: Reject option for NLL on IEMOCAP. Trends on CREMA-D are similar and are shown in the appendix.
Figure 5: Visualisation of emotion distribution for case study. Utterance (a) from IEMOCAP. Utterance (b) from CREMA-D.
...and 17 more figures
