Reinforcing Stereotypes of Anger: Emotion AI on African American Vernacular English
Rebecca Dorn, Christina Chance, Casandra Rusti, Charles Bickham, Kai-Wei Chang, Fred Morstatter, Kristina Lerman
TL;DR
This work investigates how emotion recognition systems treat African American Vernacular English (AAVE) and exposes biases that can reinforce racial stereotypes. It introduces a Dialect Density Metric (DDM) to quantify AAVE strength in 2.7M Los Angeles tweets and collects ingroup silver labels on 875 samples to serve as a culturally grounded reference. Across a suite of lexicon, transformer, and generative models, the study finds substantially higher false positive rates for anger (and disgust) on AAVE text, with anger false positive rates more than doubling relative to General American English, and highlights strong correlations between model predictions and profanity-based AAVE features. The results argue for dialect-aware, community-informed affective computing and caution against deploying emotion AI without culturally informed ground truth and evaluation, including consideration of demographic context.
Abstract
Automated emotion detection is widely used in applications ranging from well-being monitoring to high-stakes domains like mental health and hiring. However, models often rely on annotations that reflect dominant cultural norms, limiting their ability to recognize emotional expression in dialects often excluded from training data distributions, such as African American Vernacular English (AAVE). This study examines emotion recognition model performance on AAVE compared to General American English (GAE). We analyze 2.7 million tweets geo-tagged within Los Angeles. Texts are scored for strength of AAVE using computational approximations of dialect features. Annotations of emotion presence and intensity are collected on a dataset of 875 tweets with both high and low AAVE densities. To assess model accuracy on a task as subjective as emotion perception, we calculate community-informed "silver" labels where AAVE-dense tweets are labeled by African American, AAVE-fluent (ingroup) annotators. On our labeled sample, GPT and BERT-based models exhibit false positive rates for anger on AAVE that are more than double those on GAE. For SpanEmo, a popular text-based emotion model, the false positive rate for anger rises from 25 percent on GAE to 60 percent on AAVE. Additionally, a series of linear regressions reveals that model predictions and non-ingroup annotations are significantly more correlated with profanity-based AAVE features than ingroup annotations are. Linking Census tract demographics, we observe that neighborhoods with higher proportions of African American residents are associated with higher predictions of anger (Pearson's correlation r = 0.27) and lower predictions of joy (r = -0.10). These results point to an emergent safety issue: emotion AI reinforcing racial stereotypes through biased emotion classification. We emphasize the need for culturally and dialect-informed affective computing systems.
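To make the headline metric concrete, the following is a minimal sketch of how a per-dialect false positive rate for anger could be computed against silver labels. It is not the authors' evaluation code. The function name, the record layout, and the toy rows are all assumptions made for illustration; the toy rows are synthetic and were chosen only so that the output matches the 25 percent and 60 percent figures quoted for SpanEmo.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return {group: FP rate} for a binary emotion label such as anger.

    `records` is an iterable of (group, gold, pred) tuples, where `gold` is the
    silver label and `pred` is the model prediction (both booleans).
    FP rate = (gold-negative items predicted positive) / (gold-negative items).
    This record layout is a hypothetical one chosen for the sketch.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, gold, pred in records:
        if not gold:  # only gold-negative items can yield false positives
            negatives[group] += 1
            if pred:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


# Synthetic toy rows (not data from the study): (dialect group, silver anger, predicted anger)
toy_records = [
    ("GAE", False, True),
    ("GAE", False, False),
    ("GAE", False, False),
    ("GAE", False, False),
    ("GAE", True, True),    # true positive, ignored by the FP rate
    ("AAVE", False, True),
    ("AAVE", False, True),
    ("AAVE", False, True),
    ("AAVE", False, False),
    ("AAVE", False, False),
    ("AAVE", True, False),  # false negative, ignored by the FP rate
]

rates = false_positive_rate_by_group(toy_records)
for group, rate in rates.items():
    print(f"{group}: anger FP rate = {rate:.2f}")
```

Because the denominator counts only items the silver labels mark as not angry, the rate depends directly on who supplied those labels, which is why the choice of ingroup annotators for AAVE-dense tweets matters for this comparison.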
