Gender Bias in Emotion Recognition by Large Language Models
Maureen Herbert, Katie Sun, Angelica Lim, Yasaman Etesam
TL;DR
This work investigates gender bias in emotion recognition by large language models using context-rich NarraCap captions derived from EMOTIC, comparing multiple models and debiasing strategies. The authors define an equal-emission baseline and assess bias through chi-square tests and per-emotion distributions, finding that inference-time prompt methods are generally ineffective. Training-based debiasing via data augmentation and fine-tuning (FT1, FT2) substantially reduces detectable bias across emotions, though model- and emotion-specific variability remains. The study highlights practical implications for deploying LLMs in emotion-aware applications and underscores the value of training-time interventions for fairness in emotional theory of mind. It also cautions about limitations related to dataset scope, gender representation, and environmental costs of model training.
Abstract
The rapid advancement of large language models (LLMs) and their growing integration into daily life underscore the importance of evaluating and ensuring their fairness. In this work, we examine fairness within the domain of emotional theory of mind, investigating whether LLMs exhibit gender biases when presented with a description of a person and their environment and asked, "How does this person feel?" Furthermore, we propose and evaluate several debiasing strategies, demonstrating that meaningful reductions in bias require training-based interventions rather than inference-time approaches such as prompt engineering alone.
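As a rough illustration of the chi-square bias assessment mentioned in the TL;DR, the sketch below computes a Pearson chi-square statistic of independence between the gender stated in a caption and whether a model predicts a given emotion. It is a minimal sketch rather than the authors' evaluation code: the function name `chi_square_independence`, the table layout, and the counts are all hypothetical, chosen only to show the arithmetic.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows (here, one row per gender), each a list of
    counts (here, one column per outcome). Assumes every row and column
    total is nonzero, otherwise an expected count would be zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if gender and prediction were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts for a single emotion label, NOT results from the paper.
# Columns: [emotion predicted, emotion not predicted]
counts = [
    [30, 20],  # captions describing a man
    [20, 30],  # captions describing a woman
]
stat, dof = chi_square_independence(counts)
print(f"chi-square = {stat:.3f}, dof = {dof}")
```

With one degree of freedom, the 5% critical value is about 3.841, so a statistic above that threshold would suggest the prediction rate differs by gender for this emotion. In practice a library routine such as `scipy.stats.chi2_contingency` would also return a p-value; note that it applies a continuity correction to 2x2 tables by default, so its statistic can differ from the uncorrected value computed here.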
