Do LLMs Feel? Teaching Emotion Recognition with Prompts, Retrieval, and Curriculum Learning
Xinran Li, Yu Liu, Jiaqi Qiao, Xiujuan Xu
TL;DR
This work tackles Emotion Recognition in Conversation (ERC) by combining three components: prompts that integrate explicit and implicit emotion interpretations, a dedicated demonstration retrieval repository, and a curriculum-based training regime for large language models. The PRC-Emo framework fine-tunes LLMs with LoRA on dialogue samples ordered from easy to hard, where difficulty is based on weighted emotional shifts, and uses a retrieval-template module to supply relevant demonstrations and external knowledge during inference. Empirical results on IEMOCAP and MELD demonstrate state-of-the-art performance, with ablations confirming the value of each component, particularly the prompt design and the cross-speaker curriculum. By combining prompt engineering, retrieval-augmented reasoning, and progressive learning, the approach aims to improve robustness and generalization in ERC across conversational domains.
Abstract
Emotion Recognition in Conversation (ERC) is a crucial task for understanding human emotions and enabling natural human-computer interaction. Although Large Language Models (LLMs) have recently shown great potential in this field, their ability to capture the intrinsic connections between explicit and implicit emotions remains limited. We propose a novel ERC training framework, PRC-Emo, which integrates Prompt engineering, demonstration Retrieval, and Curriculum learning, with the goal of exploring whether LLMs can effectively perceive emotions in conversational contexts. Specifically, we design emotion-sensitive prompt templates based on both explicit and implicit emotional cues to better guide the model in understanding the speaker's psychological states. We construct the first dedicated demonstration retrieval repository for ERC, which includes training samples from widely used datasets, as well as high-quality dialogue examples generated by LLMs and manually verified. Moreover, we introduce a curriculum learning strategy into the LoRA fine-tuning process, incorporating weighted emotional shifts between same-speaker and different-speaker utterances to assign difficulty levels to dialogue samples, which are then organized in an easy-to-hard training sequence. Experimental results on two benchmark datasets, IEMOCAP and MELD, show that our method achieves new state-of-the-art (SOTA) performance, demonstrating the effectiveness and generalizability of our approach in improving LLM-based emotional understanding.
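
The curriculum step described above can be illustrated with a minimal sketch. The abstract does not state the scoring formula, so everything concrete here is an assumption made for illustration: the `Utterance` type, the function names, the weights `w_same` and `w_diff`, the definition of a "shift" as a change of emotion label, and the normalization by dialogue length are all hypothetical and are not taken from PRC-Emo itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Utterance:
    speaker: str
    emotion: str  # gold emotion label, e.g. "happy" or "angry"


def dialogue_difficulty(
    dialogue: List[Utterance],
    w_same: float = 1.0,  # assumed weight for same-speaker shifts
    w_diff: float = 0.5,  # assumed weight for different-speaker shifts
) -> float:
    """Score a dialogue by its weighted emotional shifts (illustrative only)."""
    last_by_speaker: Dict[str, str] = {}
    prev: Optional[Utterance] = None
    score = 0.0
    for utt in dialogue:
        # Same-speaker shift: the label differs from this speaker's previous one.
        earlier = last_by_speaker.get(utt.speaker)
        if earlier is not None and earlier != utt.emotion:
            score += w_same
        # Different-speaker shift: the label differs from the immediately
        # preceding utterance, which was spoken by someone else.
        if prev is not None and prev.speaker != utt.speaker and prev.emotion != utt.emotion:
            score += w_diff
        last_by_speaker[utt.speaker] = utt.emotion
        prev = utt
    # Normalize so that long dialogues are not harder by length alone.
    return score / max(len(dialogue), 1)


def order_easy_to_hard(dialogues: List[List[Utterance]]) -> List[List[Utterance]]:
    """Return dialogues sorted from lowest to highest difficulty score."""
    return sorted(dialogues, key=dialogue_difficulty)
```

Under these assumptions, a dialogue in which every utterance carries the same label scores 0 and is scheduled first, while dialogues with frequent label changes within and across speakers are pushed toward the end of the training sequence.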
