Language Models (Mostly) Do Not Consider Emotion Triggers When Predicting Emotion
Smriti Singh, Cornelia Caragea, Junyi Jessy Li
TL;DR
This work examines whether human-annotated emotion triggers meaningfully contribute to emotion prediction by introducing EmoTrigger, a linguist-annotated dataset of 900 short social media texts drawn from three emotion datasets. It benchmarks large language models (GPT-4, Llama2-Chat, Alpaca) and fine-tuned transformers (EmoBERTa) against unsupervised baselines (EmoLex, TopicRank) to assess trigger identification, and uses SHAP to assess feature salience. The key finding is that triggers are largely not salient features for most models, while keyphrases align much more strongly with model salience; GPT-4 is the notable exception in its ability to identify triggers. This suggests that current open-source models rely more on topical cues than on genuine emotion-trigger reasoning, highlighting a gap between appraisal-driven human emotion understanding and contemporary NLP models. The EmoTrigger dataset provides a foundation for further research into trigger-aware, interpretable emotion models and their alignment with psychological theories of appraisal.
Abstract
Situations and events evoke emotions in humans, but to what extent do they inform the predictions of emotion detection models? This work investigates how well human-annotated emotion triggers correlate with the features that models deem salient when predicting emotions. First, we introduce a novel dataset, EmoTrigger, consisting of 900 social media posts sourced from three different datasets; these were annotated by experts for emotion triggers with high agreement. Using EmoTrigger, we evaluate the ability of large language models (LLMs) to identify emotion triggers, and conduct a comparative analysis of the features considered important for these tasks between LLMs and fine-tuned models. Our analysis reveals that emotion triggers are largely not considered salient features by emotion prediction models; instead, there is an intricate interplay between various features and the task of emotion detection.
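
To make the comparison between annotated triggers and model salience concrete, the sketch below scores how many of a model's top-k salient tokens fall inside a human-annotated trigger span. It is only an illustration under stated assumptions: the function names, the top-k cutoff, and the toy sentence and scores are hypothetical, and the per-token salience values are assumed to come from an attribution method such as SHAP. It is not the metric used in the paper.

```python
from typing import Dict, Sequence, Set


def top_k_indices(scores: Sequence[float], k: int) -> Set[int]:
    """Return the indices of the k tokens with the largest absolute salience."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    return set(ranked[:k])


def trigger_overlap(scores: Sequence[float], trigger_idx: Set[int], k: int) -> Dict[str, float]:
    """Precision and recall of the top-k salient tokens against trigger tokens."""
    salient = top_k_indices(scores, k)
    hits = salient & trigger_idx
    precision = len(hits) / len(salient) if salient else 0.0
    recall = len(hits) / len(trigger_idx) if trigger_idx else 0.0
    return {"precision": precision, "recall": recall}


# Toy example (hypothetical values, not taken from EmoTrigger).
tokens = ["I", "lost", "my", "job", "today", "and", "feel", "awful"]
salience = [0.01, 0.20, 0.02, 0.15, 0.05, 0.00, 0.30, 0.45]  # assumed attribution scores
triggers = {1, 2, 3}  # token indices of the annotated trigger "lost my job"

result = trigger_overlap(salience, triggers, k=3)
print(result)
```

In this toy case the three most salient tokens are "awful", "feel", and "lost", so only one of them lies inside the trigger span. Low overlap of this kind is the pattern the abstract describes for most models.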
