Large Language Models on Fine-grained Emotion Detection Dataset with Data Augmentation and Transfer Learning

Kaipeng Wang; Zhi Jing; Yongye Su; Yikun Han

TL;DR

The paper tackles fine-grained emotion detection on GoEmotions, addressing dataset challenges such as imbalance and bias, and evaluating approaches that range from strong baselines to data augmentation and cross-domain transfer. It reproduces baseline BERT results across three taxonomies, compares them with RoBERTa, and shows that ProtAugment combined with CARER-based transfer yields the strongest gains, while LLMs such as GPT-4 struggle in zero-shot settings due to hallucination and mislabeling. The work provides concrete evidence that targeted data augmentation and cross-domain transfer can significantly improve macro-F1 scores on GoEmotions, and it highlights the limitations of current LLMs for fine-grained emotion labeling. Overall, the study offers practical pathways for improving emotion detection in NLP and points to the need for a broader survey that synthesizes methods and performance across emotion datasets.
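Because the summary reports gains in macro-F1 on an imbalanced dataset, a small illustration of that metric may help. The sketch below is not the paper's evaluation code: it assumes a simplified single-label setting (GoEmotions itself allows multiple labels per example), and the toy labels, predictions, and function names are hypothetical.

```python
from collections import defaultdict


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for single-label predictions (simplifying assumption)."""
    tp = defaultdict(int)
    fp = defaultdict(int)
    fn = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong for this example
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of examples labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: one frequent class, one rare class.
y_true = ["neutral"] * 8 + ["grief"] * 2
# A degenerate classifier that always predicts the frequent class.
y_pred = ["neutral"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case the always-majority classifier looks reasonable on accuracy but scores poorly on macro-F1, since the rare class contributes an F1 of zero with the same weight as the frequent class. That is the usual reason macro-F1 is preferred when class frequencies are skewed.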

Abstract

This paper investigates how to improve classification performance on GoEmotions, a large, manually annotated dataset for emotion detection in text. Its primary goal is to address the difficulty of detecting subtle emotions in text, a complex problem in Natural Language Processing (NLP) with significant practical applications. The findings offer insights into these challenges and suggest directions for future research, including a possible survey paper that synthesizes methods and performance across datasets in this domain.

Paper Structure (25 sections, 4 figures, 13 tables)

Introduction
Related Work
Limitations and Hypotheses
Experiments and Results
Fine-tuning BERT on GoEmotions
Transfer Learning Experiment
Analysis and Insights
Is Fine-tuned RoBERTa A Stronger Baseline?
Data Augmentation
Duplication Data Augmentation (DDA)
Traditional Language Model Embeddings
BART Paraphraser ProtAugment
Transfer Learning
CARER Overview
Experiment Setting
...and 10 more sections

Figures (4)

Figure 1: Emotion Categories Ordered by Number of Examples in the Dataset (demszky-etal-2020-goemotions)
Figure 2: Fine-tuning Losses on Three Taxonomies
Figure 3: Fine-tuning Losses on Three Taxonomies
Figure 4: Data Distribution across Datasets
