Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation
Wen Wu, Bo Li, Chao Zhang, Chung-Cheng Chiu, Qiujia Li, Junwen Bai, Tara N. Sainath, Philip C. Woodland
TL;DR
Subjectivity in emotion labeling leaves some utterances with no majority-agreed label (NMA). The authors study three strategies: (i) adding NMA as an extra class (MLE+), which harms accuracy on the majority-agreed (MA) classes; (ii) treating NMA utterances as out-of-domain (OOD) samples and detecting them through the high uncertainty produced by evidential deep learning (EDL); and (iii) estimating emotion as a distribution over classes with an extended EDL (EDL*) that uses all annotator labels. EDL-based OOD detection retains classification accuracy while improving calibration and OOD detection metrics, and EDL* achieves better negative log-likelihood on the observed annotations together with meaningful uncertainty estimates for the emotion distributions. Evaluated on IEMOCAP and CREMA-D, the methods give a richer representation of emotional content and annotator disagreement, supporting robust emotion recognition under ambiguity and more inclusive modeling of human opinions.
Abstract
The subjective perception of emotion leads to inconsistent labels from human annotators. Typically, utterances lacking majority-agreed labels are excluded when training an emotion classifier, which causes problems when encountering ambiguous emotional expressions during testing. This paper investigates three methods to handle ambiguous emotion. First, we show that incorporating utterances without majority-agreed labels as an additional class in the classifier reduces the classification performance of the other emotion classes. Then, we propose detecting utterances with ambiguous emotions as out-of-domain samples by quantifying the uncertainty in emotion classification using evidential deep learning. This approach retains the classification accuracy while effectively detecting ambiguous emotion expressions. Furthermore, to obtain fine-grained distinctions among ambiguous emotions, we propose representing emotion as a distribution instead of a single class label. The task is thus re-framed from classification to distribution estimation where every individual annotation is taken into account, not just the majority opinion. The evidential uncertainty measure is extended to quantify the uncertainty in emotion distribution estimation. Experimental results on the IEMOCAP and CREMA-D datasets demonstrate the superior capability of the proposed method in terms of majority class prediction, emotion distribution estimation, and uncertainty estimation.
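To make the two central ideas concrete, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used evidential parameterization in which non-negative per-class evidence e_k defines a Dirichlet with alpha_k = e_k + 1 and the uncertainty is K / sum(alpha); the abstract does not state these formulas. The four-class label set, the strict-majority rule, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical four-class label set, used only for illustration.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def dirichlet_from_evidence(evidence):
    """Map non-negative per-class evidence to Dirichlet parameters.

    Assumes the common evidential form alpha_k = e_k + 1.
    """
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    return [e + 1.0 for e in evidence]


def expected_distribution(alpha):
    """Mean of the Dirichlet: the predicted emotion distribution."""
    total = sum(alpha)
    return [a / total for a in alpha]


def uncertainty(alpha):
    """K / sum(alpha): equals 1 with zero evidence, shrinks as evidence grows."""
    return len(alpha) / sum(alpha)


def annotation_distribution(labels):
    """Relative frequency of each emotion across all annotators of one utterance."""
    counts = Counter(labels)
    return [counts[e] / len(labels) for e in EMOTIONS]


def has_majority(labels):
    """True if one label is chosen by more than half of the annotators (assumed rule)."""
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count > len(labels) / 2


if __name__ == "__main__":
    confident = dirichlet_from_evidence([18, 0, 1, 1])
    ambiguous = dirichlet_from_evidence([0, 0, 0, 0])
    print(expected_distribution(confident), uncertainty(confident))
    print(expected_distribution(ambiguous), uncertainty(ambiguous))
    print(annotation_distribution(["happy", "happy", "neutral"]))
    print(has_majority(["happy", "sad", "neutral"]))
```

Under these assumptions, an utterance for which the model collects little evidence receives an uncertainty near 1 and could be flagged as out-of-domain by thresholding, while the annotation frequencies serve as a soft target that keeps every annotator's opinion rather than only the majority vote.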
