BiosERC: Integrating Biography Speakers Supported by LLMs for ERC Tasks

Jieying Xue, Minh Phuong Nguyen, Blake Matheny, Le Minh Nguyen

TL;DR

This work tackles emotion recognition in conversation (ERC) by introducing BiosERC, which prompts LLMs to extract speaker biographies from conversations and injects this external knowledge into ERC models. The approach is instantiated in two architectures, a BERT-based BiosERC and an LLM-based BiosERC with instruction fine-tuning, and achieves state-of-the-art performance on three benchmarks (IEMOCAP, MELD, EmoryNLP). Ablation experiments confirm the pivotal role of speaker biographies and the efficacy of attention-based biography integration, while analyses show greater benefits for short conversations where contextual cues are limited. The method is broadly applicable to other conversation-analysis tasks and highlights the practical potential of combining LLM-derived personality signals with established ERC architectures, albeit with added computation and privacy considerations.

Abstract

In the Emotion Recognition in Conversation (ERC) task, recent work has used attention mechanisms to explore intra- and inter-speaker relationships among utterances in order to model the emotional interaction between speakers. However, attributes such as speaker personality traits remain unexplored and present challenges in terms of their applicability to other tasks or compatibility with diverse model architectures. Therefore, this work introduces a novel framework named BiosERC, which investigates speaker characteristics in a conversation. By employing Large Language Models (LLMs), we extract the "biographical information" of each speaker within a conversation as supplementary knowledge that is injected into the model to classify the emotional label of each utterance. Our proposed method achieves state-of-the-art (SOTA) results on three widely used benchmark datasets: IEMOCAP, MELD, and EmoryNLP, demonstrating the effectiveness and generalization of our model and showcasing its potential for adaptation to various conversation analysis tasks. Our source code is available at https://github.com/yingjie7/BiosERC.
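
As a rough illustration of the pipeline the abstract describes, the Python sketch below builds one biography prompt per speaker, collects the LLM's descriptions, and attaches each speaker's description to their utterances before classification. Everything named here is an assumption made for illustration and is not taken from the paper or its repository: the function names, the prompt wording, and the `[BIO]`/`[UTT]` markers are hypothetical, and the `llm` argument stands in for any text-generation call. Plain text concatenation is used only as a simple stand-in for the attention-based biography integration mentioned in the TL;DR.

```python
from typing import Callable, Dict, List, Tuple

# One utterance is a (speaker, text) pair; a conversation is a list of them.
Utterance = Tuple[str, str]


def build_bio_prompt(conversation: List[Utterance], speaker: str) -> str:
    """Build a prompt asking an LLM to describe one speaker.

    The wording is hypothetical, not the prompt used in the paper.
    """
    dialogue = "\n".join(f"{spk}: {text}" for spk, text in conversation)
    return (
        "Given the conversation below, briefly describe the characteristics "
        f"of the speaker {speaker}.\n\n{dialogue}\n\nDescription of {speaker}:"
    )


def extract_biographies(
    conversation: List[Utterance], llm: Callable[[str], str]
) -> Dict[str, str]:
    """Query the LLM once per distinct speaker, in order of first appearance."""
    speakers = list(dict.fromkeys(spk for spk, _ in conversation))
    return {
        spk: llm(build_bio_prompt(conversation, spk)).strip() for spk in speakers
    }


def build_classifier_inputs(
    conversation: List[Utterance], bios: Dict[str, str]
) -> List[str]:
    """Prefix each utterance with its speaker's biography.

    The [BIO]/[UTT] markers are assumed placeholders; a real model could
    integrate the biography differently (for example, through attention).
    """
    return [f"[BIO] {bios[spk]} [UTT] {spk}: {text}" for spk, text in conversation]


if __name__ == "__main__":
    # Stub LLM so the sketch runs without any external service.
    def stub_llm(prompt: str) -> str:
        return "a placeholder description"

    conv = [("Ross", "I can't believe it!"), ("Rachel", "Calm down.")]
    bios = extract_biographies(conv, stub_llm)
    for line in build_classifier_inputs(conv, bios):
        print(line)
```

Each string produced by `build_classifier_inputs` would then be passed to an emotion classifier, one per utterance. Replacing `stub_llm` with a real model call is the only change needed to obtain actual speaker descriptions.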

Paper Structure

This paper contains 27 sections, 8 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Overview of our BiosERC framework.
  • Figure 2: Overview of our BiosERC model architecture.
  • Figure 3: Performance comparison between our BERT-based BiosERC and the baseline model (MELD dev set), illustrating the performance variability across 10 random runs.
  • Figure 4: Performance comparison with respect to conversation length (number of utterances) on the MELD development set, illustrating the variability across 10 random runs.