ESIHGNN: Event-State Interactions Infused Heterogeneous Graph Neural Network for Conversational Emotion Recognition

Xupeng Zha, Huan Zhao, Zixing Zhang

TL;DR

ESIHGNN addresses conversational emotion recognition by incorporating the speaker's emotional state into a heterogeneous event-state graph and updating representations turn-by-turn with a heterogeneous directed acyclic graph neural network. Edges are enriched with external knowledge via COMET to capture nuanced interactions between events and emotions. The approach achieves competitive or state-of-the-art performance on four benchmark datasets, demonstrating the value of explicit state-event interactions and knowledge integration for real-time CER. This work advances emotion-aware dialogue understanding and suggests future directions in building emotion-centric knowledge graphs.

Abstract

Conversational Emotion Recognition (CER) aims to predict the emotion expressed by an utterance (referred to as an "event") during a conversation. Existing graph-based methods mainly focus on event interactions to comprehend the conversational context, while overlooking the direct influence of the speaker's emotional state on the events. In addition, real-time modeling of the conversation is crucial for real-world applications but is rarely considered. To this end, we propose a novel graph-based approach, namely Event-State Interactions infused Heterogeneous Graph Neural Network (ESIHGNN), which incorporates the speaker's emotional state and constructs a heterogeneous event-state interaction graph to model the conversation. Specifically, a heterogeneous directed acyclic graph neural network is employed to dynamically update and enhance the representations of events and emotional states at each turn, thereby improving conversational coherence and consistency. To further improve CER performance, we enrich the graph's edges with external knowledge. Experimental results on four publicly available CER datasets show the superiority of our approach and the effectiveness of the introduced heterogeneous event-state interaction graph.
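
To make the idea of a heterogeneous event-state graph concrete, the following is a minimal Python sketch of one way such a directed acyclic graph could be laid out over conversation turns. It is not the authors' construction: the function names, the four edge kinds, the context window, and the rule that a speaker's most recent state feeds their next event are all assumptions made for illustration. Node representations, the neural update, and the knowledge-enriched edge features are omitted.

```python
from collections import namedtuple

# Hypothetical edge record: source node, destination node, edge kind.
Edge = namedtuple("Edge", ["src", "dst", "kind"])


def build_event_state_dag(speakers, window=2):
    """Build an illustrative event-state graph for a conversation.

    speakers: list of speaker ids, one per turn.
    window:   how many preceding events connect to the current event
              (an assumed hyperparameter, not taken from the paper).
    Each turn t contributes two nodes: ("event", t) and ("state", t).
    """
    edges = []
    last_turn_of = {}  # speaker id -> index of that speaker's latest turn
    for t, speaker in enumerate(speakers):
        event, state = ("event", t), ("state", t)
        # Earlier events inside the window provide context for this event.
        for j in range(max(0, t - window), t):
            edges.append(Edge(("event", j), event, "event-event"))
        prev = last_turn_of.get(speaker)
        if prev is not None:
            # The speaker's previous emotional state influences the new
            # event and carries over into the new state.
            edges.append(Edge(("state", prev), event, "state-event"))
            edges.append(Edge(("state", prev), state, "state-state"))
        # The current event updates the speaker's current state.
        edges.append(Edge(event, state, "event-state"))
        last_turn_of[speaker] = t
    return edges


def is_turn_ordered(edges):
    """Check that every edge points forward in (turn, event-before-state)
    order, which guarantees the graph is acyclic."""
    def rank(node):
        return (node[1], 0 if node[0] == "event" else 1)
    return all(rank(e.src) < rank(e.dst) for e in edges)


edges = build_event_state_dag(["A", "B", "A", "B"], window=2)
print(len(edges), is_turn_ordered(edges))
```

Because every edge points from an earlier position to a later one, nodes can be processed one turn at a time using only past information, which is the property that makes a turn-by-turn, real-time update possible in a sketch like this.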

Paper Structure

This paper contains 11 sections, 9 equations, 1 figure, 3 tables.

Figures (1)

  • Figure 1: The introduced ESIHGNN framework, where we present the interactions between nodes of the 3rd turn and their predecessors and successors in a conversation example.