Towards More Accurate Prediction of Human Empathy and Emotion in Text and Multi-turn Conversations by Combining Advanced NLP, Transformers-based Networks, and Linguistic Methodologies

Manisha Singh, Divy Sharma, Alonso Ma, Nora Goldfine

TL;DR

This work tackles the prediction of human empathy and emotion in text and multi-turn conversations by leveraging a multi-stage pipeline that combines advanced NLP embeddings, a feed-forward neural network, and lexicon-based feature enrichment. Starting from a baseline FFN using sentence-level embeddings, the authors iteratively improve through hyperparameter tuning, stratified sampling to address class imbalance, and the integration of lexical resources, culminating in an ensemble that blends neural and SVR models. The approach is validated on the WASSA 2022 empathy and distress task and adapted to the WASSA 2023 empathy, emotion polarity, and intensity task in conversations, achieving substantial gains over baselines and competitive performance on held-out test sets. The results demonstrate that combining diverse embeddings, linguistic features, and ensembling can robustly predict nuanced affective states in both essay- and turn-level discourse, with practical implications for empathetic AI agents and affect-aware chat systems.
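The blending step mentioned above can be illustrated with a minimal sketch. It assumes the ensemble is a weighted average of per-sample predictions from the two model families and that quality is measured with Pearson correlation; the summary states neither detail. The function names, the `weight` parameter, and the scores are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def blend(ffn_preds: Sequence[float], svr_preds: Sequence[float],
          weight: float = 0.5) -> List[float]:
    """Weighted average of two models' predictions.

    `weight` is applied to the FFN output and (1 - weight) to the SVR output.
    Assumption: the paper's actual blending rule may differ.
    """
    return [weight * f + (1.0 - weight) * s
            for f, s in zip(ffn_preds, svr_preds)]


# Made-up validation scores, not data from the paper.
gold = [1.0, 2.5, 4.0, 5.5, 7.0]
ffn_preds = [1.5, 2.0, 4.5, 5.0, 6.0]
svr_preds = [2.0, 3.0, 3.5, 6.0, 6.5]

blended = blend(ffn_preds, svr_preds, weight=0.5)
score = pearson_r(gold, blended)
```

In practice the blend weight would be tuned on a validation split rather than fixed at 0.5.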

Abstract

Based on the WASSA 2022 Shared Task on Empathy Detection and Emotion Classification, we predict the level of empathic concern and personal distress displayed in essays. In the first stage of this project, we implemented a Feed-Forward Neural Network using sentence-level embeddings as features, and we experimented with four different embedding models for generating the inputs to the network. The second stage builds on this work with three types of revisions. The first revision enhances the model architecture and the training approach. The second handles class imbalance using stratified data sampling. The third leverages lexical resources: we apply four different resources to enrich the features associated with the dataset. In the final stage, we built the end-to-end system for the primary task, using an ensemble of models to improve primary task performance. Also in the final stage, we adapted these approaches to the WASSA 2023 Shared Task on Empathy, Emotion and Personality Detection in Interactions, in which empathic concern, emotion polarity, and emotion intensity are predicted in dyadic text conversations.
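As a rough illustration of the stratified sampling revision, the sketch below bins a continuous target into equal-width intervals and draws a validation subset from every bin, so that sparsely populated score ranges are still represented. This is one common way to stratify a regression target and is an assumption here; the abstract does not describe the authors' exact procedure. The function name, bin count, and split fraction are hypothetical.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def stratified_split(targets: Sequence[float], n_bins: int = 3,
                     val_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train_idx, val_idx), sampling validation items per target bin."""
    lo, hi = min(targets), max(targets)
    # Equal-width bins; fall back to width 1.0 if all targets are identical.
    width = (hi - lo) / n_bins or 1.0
    bins = defaultdict(list)
    for i, t in enumerate(targets):
        # Clamp so the maximum value lands in the last bin.
        b = min(int((t - lo) / width), n_bins - 1)
        bins[b].append(i)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for b in sorted(bins):
        members = bins[b]
        rng.shuffle(members)
        # Take at least one validation item from any bin with 2+ members.
        n_val = max(1, round(len(members) * val_fraction)) if len(members) > 1 else 0
        val_idx.extend(members[:n_val])
        train_idx.extend(members[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Made-up, imbalanced scores: many low values, few mid and high values.
targets = [1.0, 1.2, 1.4, 1.6, 1.8, 4.0, 4.2, 6.5, 6.8, 7.0]
train_idx, val_idx = stratified_split(targets, n_bins=3, val_fraction=0.2)
```

With a purely random split, a small validation set could easily contain no mid-range or high-range examples; binning first guards against that.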

Paper Structure (30 sections, 14 figures, 6 tables)

Figures (14)

  • Figure 1: Architecture Overview.
  • Figure 2: Distribution of Empathy and Distress values in the training dataset for the primary task, indicating an imbalance in the distribution of samples
  • Figure 3: Distribution of Empathy, Emotion Polarity, and Emotion Intensity values in the training dataset for the adaptation task, indicating an imbalance in the distribution of samples
  • Figure 4: Samples from the Training Dataset with Embeddings (Sentence Transformer)
  • Figure 5: Training and validation losses: (a) before hyperparameter tuning, (b) after hyperparameter tuning
  • ...and 9 more figures