Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models
Mia Mohammad Imran
TL;DR
This work addresses emotion classification in software engineering text by systematically benchmarking six pre-trained transformer models against SEntiMoji on GitHub and Stack Overflow datasets annotated with Shaver's emotion taxonomy. It demonstrates that general-domain PTMs, notably DeBERTa and RoBERTa, provide consistent improvements over the SE-specific baselines, achieving up to 16.79% macro and 15.07% micro F1 gains, and that injecting polarity features into the attention mechanism yields additional 1.0%–10.23% improvements. The study includes thorough error analysis and highlights persistent challenges such as figurative language, pragmatics, and emoji handling, outlining future directions like hierarchical emotion modeling, ABSA-enhanced PTMs, and multi-modal approaches. Overall, the results establish PTMs as a robust means for fine-grained emotion recognition in SE contexts and offer a roadmap for enhancing contextual understanding in empathetic software systems.
Abstract
Emotion recognition in software engineering texts is critical for understanding developer expressions and improving collaboration. This paper presents a comparative analysis of state-of-the-art Pre-trained Language Models (PTMs) for fine-grained emotion classification on two benchmark datasets from GitHub and Stack Overflow. We evaluate six transformer models (BERT, RoBERTa, ALBERT, DeBERTa, CodeBERT, and GraphCodeBERT) against the current best-performing tool, SEntiMoji. Our analysis reveals consistent improvements ranging from 1.17% to 16.79% in macro-averaged and micro-averaged F1 scores, with general-domain models outperforming specialized ones. To further enhance PTMs, we incorporate polarity features into the attention layer during training, demonstrating additional average gains of 1.0% to 10.23% over the baseline PTM approaches. Our work provides strong evidence for the advancements afforded by PTMs in recognizing nuanced emotions such as Anger, Love, Fear, Joy, Sadness, and Surprise in software engineering contexts. Through comprehensive benchmarking and error analysis, we also outline scope for improvement in addressing contextual gaps.
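Because the results above are reported as both macro-averaged and micro-averaged F1, a small illustration of how the two differ may help. The following is a minimal sketch, not the paper's evaluation code: it assumes a single-label setting, and the function names (`f1`, `macro_micro_f1`) and the toy labels are hypothetical. Macro F1 averages per-class F1 scores equally, so a missed rare emotion lowers it sharply, while micro F1 pools all decisions and is dominated by frequent classes.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when the class never occurs or is never predicted."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true, y_pred, labels):
    """Return (macro_f1, micro_f1, per_class_f1) for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gets a false positive
            fn[t] += 1  # true class gets a false negative
    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    # Macro: unweighted mean of per-class F1 scores.
    macro = sum(per_class.values()) / len(labels)
    # Micro: F1 computed on counts pooled over all classes.
    micro = f1(
        sum(tp[c] for c in labels),
        sum(fp[c] for c in labels),
        sum(fn[c] for c in labels),
    )
    return macro, micro, per_class


# Toy data (hypothetical): the single Anger example is misclassified as Joy.
y_true = ["Joy", "Joy", "Joy", "Joy", "Anger", "Fear"]
y_pred = ["Joy", "Joy", "Joy", "Joy", "Joy", "Fear"]

macro, micro, per_class = macro_micro_f1(y_true, y_pred, ["Joy", "Anger", "Fear"])
print(f"macro F1 = {macro:.4f}, micro F1 = {micro:.4f}")
print(per_class)
```

In this toy case the missed Anger example pulls macro F1 well below micro F1, which is why reporting both is informative when emotion classes are imbalanced.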
