Towards More Accurate Prediction of Human Empathy and Emotion in Text and Multi-turn Conversations by Combining Advanced NLP, Transformers-based Networks, and Linguistic Methodologies
Manisha Singh, Divy Sharma, Alonso Ma, Nora Goldfine
TL;DR
This work tackles the prediction of human empathy and emotion in text and multi-turn conversations with a multi-stage pipeline that combines advanced NLP embeddings, a feed-forward neural network (FFN), and lexicon-based feature enrichment. Starting from a baseline FFN that uses sentence-level embeddings, the authors iteratively improve it through hyperparameter tuning, stratified sampling to address class imbalance, and the integration of lexical resources, culminating in an ensemble that blends neural and SVR models. The approach is validated on the WASSA 2022 empathy and distress task and adapted to the WASSA 2023 task on empathy, emotion polarity, and emotion intensity in conversations, achieving substantial gains over baselines and competitive performance on held-out test sets. The results indicate that combining diverse embeddings, linguistic features, and ensembling can robustly predict nuanced affective states in both essay-level and turn-level discourse, with practical implications for empathetic AI agents and affect-aware chat systems.
Abstract
Based on the WASSA 2022 Shared Task on Empathy Detection and Emotion Classification, we predict the level of empathic concern and personal distress displayed in essays. In the first stage of this project, we implemented a Feed-Forward Neural Network using sentence-level embeddings as features, and experimented with four different embedding models for generating the inputs to the network. The second stage builds upon this work with three types of revisions. The first revision enhances the model architecture and the training approach. The second addresses class imbalance using stratified data sampling. The third leverages lexical resources: we apply four different resources to enrich the features associated with the dataset. In the final stage, we created an end-to-end system for the primary task that uses an ensemble of models to improve primary task performance. Additionally, as part of the final stage, these approaches were adapted to the WASSA 2023 Shared Task on Empathy, Emotion and Personality Detection in Interactions, in which empathic concern, emotion polarity, and emotion intensity in dyadic text conversations are predicted.
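
As an illustration of the ensembling idea in the final stage, the following is a minimal sketch of blending the predictions of two regressors with a weighted average. The abstract does not state how the ensemble combines its members or how it is tuned, so the weighted average, the grid search over the weight, the use of Pearson correlation as the selection criterion, and all names and numbers below are assumptions made for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def blend(preds_a, preds_b, weight):
    """Weighted average: weight on model A, (1 - weight) on model B."""
    return [weight * a + (1 - weight) * b for a, b in zip(preds_a, preds_b)]


def best_blend_weight(preds_a, preds_b, gold, steps=10):
    """Grid-search the blend weight that maximizes Pearson r on held-out data."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w: pearson(blend(preds_a, preds_b, w), gold))


# Hypothetical development-set scores (not taken from the paper).
gold = [1.0, 2.0, 3.0, 4.0, 5.0]
ffn_preds = [1.5, 1.8, 3.4, 3.6, 5.2]  # stand-in for neural network outputs
svr_preds = [0.8, 2.5, 2.7, 4.4, 4.7]  # stand-in for SVR outputs

weight = best_blend_weight(ffn_preds, svr_preds, gold)
ensemble_preds = blend(ffn_preds, svr_preds, weight)
```

Because the candidate weights include 0 and 1, the selected blend is never worse on the development data than either model alone. Whether that advantage carries over to a test set depends on how well the development data represents it.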
