Reinforcing Stereotypes of Anger: Emotion AI on African American Vernacular English

Rebecca Dorn, Christina Chance, Casandra Rusti, Charles Bickham, Kai-Wei Chang, Fred Morstatter, Kristina Lerman

TL;DR

This work investigates how emotion recognition systems treat African American Vernacular English (AAVE) and exposes biases that can reinforce racial stereotypes. It introduces a Dialect Density Metric (DDM) to quantify AAVE strength in 2.7M Los Angeles tweets and collects ingroup silver labels on 875 samples to create culturally grounded ground truth. Across a suite of lexicon, transformer, and generative models, the study finds substantially higher false positives for anger (and disgust) in AAVE text, with anger false positive rates more than doubling relative to General American English (GAE), and finds that model predictions track profanity-based AAVE features more closely than ingroup annotations do. The results argue for dialect-aware, community-informed affective computing and caution against deploying emotion AI without culturally informed ground truth and evaluation, including demographic-context considerations.

Abstract

Automated emotion detection is widely used in applications ranging from well-being monitoring to high-stakes domains like mental health and hiring. However, models often rely on annotations that reflect dominant cultural norms, limiting their ability to recognize emotional expression in dialects often excluded from training data distributions, such as African American Vernacular English (AAVE). This study examines emotion recognition model performance on AAVE compared to General American English (GAE). We analyze 2.7 million tweets geo-tagged within Los Angeles. Texts are scored for strength of AAVE using computational approximations of dialect features. Annotations of emotion presence and intensity are collected on a dataset of 875 tweets with both high and low AAVE densities. To assess model accuracy on a task as subjective as emotion perception, we calculate community-informed "silver" labels where AAVE-dense tweets are labeled by African American, AAVE-fluent (ingroup) annotators. On our labeled sample, GPT and BERT-based models exhibit false positive rates for anger on AAVE that are more than double those on GAE. For SpanEmo, a popular text-based emotion model, the false positive rate for anger rises from 25 percent on GAE to 60 percent on AAVE. Additionally, a series of linear regressions reveals that model predictions and non-ingroup annotations are significantly more correlated with profanity-based AAVE features than ingroup annotations are. Linking Census tract demographics, we observe that neighborhoods with higher proportions of African American residents are associated with higher predictions of anger (Pearson's correlation r = 0.27) and lower predictions of joy (r = -0.10). These results reveal an emergent safety issue: emotion AI reinforcing racial stereotypes through biased emotion classification. We emphasize the need for culturally and dialect-informed affective computing systems.
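
The per-dialect false positive comparison described in the abstract can be illustrated with a minimal sketch. It assumes a false positive is a text that the model labels as angry while the silver label says anger is absent, and that the rate is FP / (FP + TN). The abstract does not spell out the exact definition, and the function name, record layout, and toy records below are hypothetical.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Compute FP / (FP + TN) per dialect group.

    records: iterable of (group, silver_label, predicted_label) tuples,
    where both labels are booleans for "anger present".
    Groups without any silver-negative texts are omitted.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, silver, predicted in records:
        if not silver:  # only silver-negative texts can yield false positives
            negatives[group] += 1
            if predicted:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


# Made-up toy records, not data from the paper.
toy_records = [
    ("AAVE", False, True),
    ("AAVE", False, True),
    ("AAVE", False, False),
    ("AAVE", True, True),   # true positive, ignored by the rate
    ("GAE", False, True),
    ("GAE", False, False),
    ("GAE", False, False),
    ("GAE", False, False),
]
rates = false_positive_rate_by_group(toy_records)
```

Conditioning on silver-negative texts keeps the comparison independent of how often anger actually occurs in each group, which matters when the two dialect groups have different base rates.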

Paper Structure

This paper contains 47 sections, 1 equation, 11 figures, 5 tables.

Figures (11)

  • Figure 1: Relationship between neighborhood demographics, dialect density, and predictions of anger from SpanEmo, a popular BERT-based model. Neighborhood boundaries are determined using groupings of Census tracts outlined by Mapping LA [latimesMappingLA]. Demographics are approximated from Census data.
  • Figure 2: Method to approximate AAVE dialect density (DDM) in a text. Sociolinguistic features are individually approximated using NLP tools such as regular expressions, dependency parsing, named entity recognition, and perplexity. Averaging the normalized features yields the scalar DDM score.
  • Figure 3: Over forty fine-grained emotions distilled into seven primary emotion categories. Each secondary emotion is labeled with emotion conceptualizations containing the emotion. Difficulty in grouping emotions highlights inherent tensions with categorical emotion theories.
  • Figure 4: Example Prompting Schema. Blue text indicates zero-shot (zero) prompting schema. Few-shot (few) expands to include three (text, output) pairs as shown in purple. Finally, chain-of-thought (COT) prompting schema adds reasoning steps as featured in green.
  • Figure 5: Annotator agreement measured by Cohen's Kappa averaged over pairs of annotators, stratified by group membership and individual emotion. Y-label denotes annotator subset of agreement calculation, where "In/In" is calculated between ingroup members, "In/Out" between ingroup and outgroup member pairings, and "Out/Out" between outgroup annotators. Agreement is highest for joy and anger, particularly within ingroup annotations.
  • ...and 6 more figures
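
The Figure 2 caption describes the DDM as an average over normalized per-feature scores. A minimal sketch of that aggregation step follows. The two regex detectors, the per-token scaling, and the min-max normalization across the corpus are all assumptions made for illustration; the caption does not list the paper's actual features or its normalization, and the function names are hypothetical.

```python
import re

# Hypothetical detectors for illustration only; not the paper's feature set.
FEATURE_PATTERNS = {
    "habitual_be": re.compile(r"\bbe\s+\w+ing\b", re.IGNORECASE),
    "aint": re.compile(r"\bain'?t\b", re.IGNORECASE),
}


def raw_features(text):
    """Count matches per feature, scaled by whitespace token count."""
    n_tokens = max(len(text.split()), 1)
    return {
        name: len(pattern.findall(text)) / n_tokens
        for name, pattern in FEATURE_PATTERNS.items()
    }


def dialect_density(texts):
    """Return one score in [0, 1] per text.

    Each feature is min-max normalized across the corpus (an assumption),
    then the normalized features are averaged into a single scalar.
    """
    raw = [raw_features(t) for t in texts]
    totals = [0.0] * len(texts)
    for name in FEATURE_PATTERNS:
        values = [r[name] for r in raw]
        lo, hi = min(values), max(values)
        span = hi - lo
        for i, value in enumerate(values):
            # A feature that never varies contributes 0 for every text.
            totals[i] += (value - lo) / span if span > 0 else 0.0
    return [total / len(FEATURE_PATTERNS) for total in totals]


corpus = [
    "they be working late",
    "I am working late",
    "she ain't here and they be talking",
]
scores = dialect_density(corpus)
```

Averaging normalized features gives each feature equal weight regardless of how frequently it occurs in raw counts. Detectors based on dependency parsing, named entity recognition, or perplexity, which the caption also mentions, would slot into the same dictionary-of-features shape.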