Table of Contents
Fetching ...

Deep Emotion Recognition in Textual Conversations: A Survey

Patrícia Pereira, Helena Moniz, Joao Paulo Carvalho

TL;DR

The advantage of leveraging techniques to address unbalanced data, the exploration of mixed emotions, and the benefits of incorporating annotation subjectivity in the learning phase are emphasized.

Abstract

Emotion Recognition in Conversations (ERC) is a key step towards successful human-machine interaction. While the field has seen tremendous advancement in the last few years, new applications and implementation scenarios present novel challenges and opportunities. These range from leveraging the conversational context, speaker, and emotion dynamics modelling, to interpreting common sense expressions, informal language, and sarcasm, addressing challenges of real-time ERC, recognizing emotion causes, different taxonomies across datasets, multilingual ERC, and interpretability. This survey starts by introducing ERC, elaborating on the challenges and opportunities of this task. It proceeds with a description of the emotion taxonomies and a variety of ERC benchmark datasets employing such taxonomies. This is followed by descriptions comparing the most prominent works in ERC with explanations of the neural architectures employed. Then, it provides advisable ERC practices towards better frameworks, elaborating on methods to deal with subjectivity in annotations and modelling and methods to deal with the typically unbalanced ERC datasets. Finally, it presents systematic review tables comparing several works regarding the methods used and their performance. Benchmarking these works highlights resorting to pre-trained Transformer Language Models to extract utterance representations, using Gated and Graph Neural Networks to model the interactions between these utterances, and leveraging Generative Large Language Models to tackle ERC within a generative framework. This survey emphasizes the advantage of leveraging techniques to address unbalanced data, the exploration of mixed emotions, and the benefits of incorporating annotation subjectivity in the learning phase.

Deep Emotion Recognition in Textual Conversations: A Survey

TL;DR

The advantage of leveraging techniques to address unbalanced data, the exploration of mixed emotions, and the benefits of incorporating annotation subjectivity in the learning phase are emphasized.

Abstract

Emotion Recognition in Conversations (ERC) is a key step towards successful human-machine interaction. While the field has seen tremendous advancement in the last few years, new applications and implementation scenarios present novel challenges and opportunities. These range from leveraging the conversational context, speaker, and emotion dynamics modelling, to interpreting common sense expressions, informal language, and sarcasm, addressing challenges of real-time ERC, recognizing emotion causes, different taxonomies across datasets, multilingual ERC, and interpretability. This survey starts by introducing ERC, elaborating on the challenges and opportunities of this task. It proceeds with a description of the emotion taxonomies and a variety of ERC benchmark datasets employing such taxonomies. This is followed by descriptions comparing the most prominent works in ERC with explanations of the neural architectures employed. Then, it provides advisable ERC practices towards better frameworks, elaborating on methods to deal with subjectivity in annotations and modelling and methods to deal with the typically unbalanced ERC datasets. Finally, it presents systematic review tables comparing several works regarding the methods used and their performance. Benchmarking these works highlights resorting to pre-trained Transformer Language Models to extract utterance representations, using Gated and Graph Neural Networks to model the interactions between these utterances, and leveraging Generative Large Language Models to tackle ERC within a generative framework. This survey emphasizes the advantage of leveraging techniques to address unbalanced data, the exploration of mixed emotions, and the benefits of incorporating annotation subjectivity in the learning phase.
Paper Structure (62 sections, 17 equations, 7 figures, 2 tables)

This paper contains 62 sections, 17 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Evolution of the number of ERC publications across the years
  • Figure 2: Distribution of common emotions over the VAD 3-dimensional space bualan2019emotion
  • Figure 3: The Plutchik wheel of emotions. Image retrieved from Wikipedia.
  • Figure 4: A Multy Layer Perceptron with one hidden layer, input layer with feature dimension $N$ and output layer with feature dimension $C$.
  • Figure 5: Two visual descriptions of the Recurrent Neural Network. The description on the left is unrolled, highlighting that parameters $W$, $U$ and $b$ are shared between timesteps.
  • ...and 2 more figures