AER-LLM: Ambiguity-aware Emotion Recognition Leveraging Large Language Models
Xin Hong, Yuan Gong, Vidhyasaharan Sethu, Ting Dang
TL;DR
The paper tackles ambiguity in emotion labeling by prompting large language models (LLMs) to predict a full emotion distribution $\hat{p}(x)$ rather than a single label, and compares it with the ground-truth distribution $p(x)$ inferred from multiple annotators. It uses zero-shot and few-shot prompting augmented with dialogue history as context and, in some cases, speech features represented textually, to strengthen in-context learning. Across MSP-Podcast, IEMOCAP, and GoEmotions, the approach yields significant improvements on distribution-level metrics (e.g., Jensen-Shannon divergence, Bhattacharyya coefficient, $R^2$, and calibration error), with clear benefits from context windows of size $M$ (optimally around 10–20) and from multimodal prompts. The findings indicate that LLMs are more effective on less ambiguous emotions, and they point toward more natural, emotion-aware conversational AI, with practical implications for adaptive communication strategies.
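To make the distribution comparison concrete, here is a minimal sketch of two of the metrics named above, Jensen-Shannon divergence and the Bhattacharyya coefficient, computed between a predicted distribution and one built from annotator votes. It is an illustration, not the paper's evaluation code: the function names, the emotion classes, the example votes, the use of normalized vote counts as $p(x)$, and the base-2 logarithm are all assumptions.

```python
import math
from collections import Counter


def label_distribution(votes, classes):
    """Turn a list of annotator votes into a probability distribution.

    Assumption: the ground truth p(x) is the normalized vote count per class.
    """
    counts = Counter(votes)
    total = len(votes)
    return [counts[c] / total for c in classes]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence; with log base 2 it lies in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def bhattacharyya_coefficient(p, q):
    """Overlap between two distributions: 1 if identical, 0 if disjoint."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


# Hypothetical example: four annotators label one utterance.
classes = ["angry", "happy", "sad", "neutral"]
p_true = label_distribution(["happy", "happy", "neutral", "sad"], classes)
p_pred = [0.05, 0.60, 0.15, 0.20]  # made-up model output for illustration

print("JS divergence:", js_divergence(p_true, p_pred))
print("Bhattacharyya coefficient:", bhattacharyya_coefficient(p_true, p_pred))
```

Lower Jensen-Shannon divergence and a higher Bhattacharyya coefficient both indicate that $\hat{p}(x)$ is closer to $p(x)$.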
Abstract
Recent advancements in Large Language Models (LLMs) have demonstrated great success in many Natural Language Processing (NLP) tasks. In addition to their cognitive intelligence, exploring their capabilities in emotional intelligence is also crucial, as it enables more natural and empathetic conversational AI. Recent studies have shown LLMs' capability in recognizing emotions, but they often focus on single emotion labels and overlook the complex and ambiguous nature of human emotions. This study is the first to address this gap by exploring the potential of LLMs in recognizing ambiguous emotions, leveraging their strong generalization capabilities and in-context learning. We design zero-shot and few-shot prompting and incorporate past dialogue as context information for ambiguous emotion recognition. Experiments conducted using three datasets indicate significant potential for LLMs in recognizing ambiguous emotions, and highlight the substantial benefits of including context information. Furthermore, our findings indicate that LLMs demonstrate a high degree of effectiveness in recognizing less ambiguous emotions and exhibit potential for identifying more ambiguous emotions, paralleling human perceptual capabilities.
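The prompting setup described above can be sketched roughly as follows. This is a hypothetical illustration, not the authors' prompt: the wording of the instructions, the `build_prompt` helper, its parameters, and the choice to count the context window `m` in past utterances are all assumptions. Passing no examples gives a zero-shot prompt; passing labeled examples gives a few-shot prompt.

```python
def build_prompt(history, target, classes, m=3, examples=()):
    """Assemble a text prompt asking for an emotion distribution.

    history  : earlier utterances in the dialogue, oldest first
    target   : the utterance to label
    classes  : emotion labels the model should score
    m        : number of most recent past utterances to include (assumed unit)
    examples : optional (text, distribution) pairs for few-shot prompting
    """
    # history[-0:] would return the whole list, so handle m <= 0 explicitly.
    context = history[-m:] if m > 0 else []

    lines = [
        "Estimate how likely each emotion is for the final utterance.",
        "Emotions: " + ", ".join(classes),
        "Answer with one probability per emotion, summing to 1.",
    ]
    for text, dist in examples:
        shown = ", ".join(f"{c}={p:.2f}" for c, p in zip(classes, dist))
        lines.append(f"Example: {text!r} -> {shown}")
    if context:
        lines.append("Previous dialogue:")
        lines.extend(f"- {utterance}" for utterance in context)
    lines.append(f"Utterance to label: {target}")
    return "\n".join(lines)


# Hypothetical dialogue; only the last two turns are kept as context.
history = ["turn-1: Hi.", "turn-2: You are late.", "turn-3: Traffic was bad.",
           "turn-4: Again?"]
prompt = build_prompt(
    history,
    target="I said I was sorry.",
    classes=["angry", "happy", "sad", "neutral"],
    m=2,
    examples=[("I got the job!", [0.0, 0.9, 0.0, 0.1])],
)
print(prompt)
```

Keeping the window size as an explicit parameter makes it easy to compare prompts with and without dialogue context.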
