My LLM might Mimic AAE -- But When Should it?
Sandra C. Sandoval, Christabel Acquaye, Kwesi Cobbina, Mohammad Nayeem Teli, Hal Daumé
TL;DR
The paper examines how Black Americans view African American English (AAE) in AI, when they want it represented, and whether large language models can generate authentic AAE when prompted. It combines a survey of 104 participants with an annotation task involving 228 annotators, comparing outputs from three prominent LLMs against human AAE baselines drawn from the CORAAL and Twitter corpora. Preferences are nuanced: participants favor Mainstream U.S. English for formal tasks and are open to AAE in casual contexts when users choose it. Annotators also found that LLM-generated AAE can be as authentic as human speech while remaining non-offensive. These insights support broader, ethically guided inclusion of dialect diversity in AI, while underscoring the need for safeguards against offensive or mocking outputs.
Abstract
We examine the representation of African American English (AAE) in large language models (LLMs), exploring (a) the perceptions Black Americans have of how effective these technologies are at producing authentic AAE, and (b) in what contexts Black Americans find this desirable. Through both a survey of Black Americans ($n = 104$) and annotation of LLM-produced AAE by Black Americans ($n = 228$), we find that Black Americans favor choice and autonomy in determining when AAE is appropriate in LLM output. They tend to prefer that LLMs default to communicating in Mainstream U.S. English in formal settings, with greater interest in AAE production in less formal settings. When LLMs were appropriately prompted and provided with in-context examples, our participants found their outputs to have a level of AAE authenticity on par with transcripts of Black American speech. Select code and data for our project can be found here: https://github.com/smelliecat/AAEMime.git
