Representation Bias of Adolescents in AI: A Bilingual, Bicultural Study
Robert Wolfe, Aayushi Dangol, Bill Howe, Alexis Hiniker
TL;DR
This study examines how adolescents are represented in AI by analyzing English and Nepali static word embeddings (SWEs) and generative language models (GLMs), using a bilingual, bicultural design that pairs model analyses with workshops with U.S. and Nepalese teens. It finds that English-language models frequently associate adolescents with social problems, especially violence, while Nepali models show fewer such associations; both, however, are misaligned with teens' own lived experiences. The authors argue for participatory, diversity-focused AI design to counter sensationalized portrayals and offer a template for how AI could learn about and present adolescents more fairly. The work highlights the need for language- and culture-sensitive approaches that avoid amplifying stereotypes, so that AI reflects youth perspectives rather than media-driven narratives.
Abstract
Popular and news media often portray teenagers with sensationalism, as both a risk to society and at risk from society. As AI begins to absorb some of the epistemic functions of traditional media, we study, for teenagers in two countries speaking two languages, 1) how they are depicted by AI, and 2) how they would prefer to be depicted. Specifically, we study the biases about teenagers learned by static word embeddings (SWEs) and generative language models (GLMs), comparing these with the perspectives of adolescents living in the U.S. and Nepal. We find that English-language SWEs associate teenagers with societal problems: more than 50% of the 1,000 words most associated with teenagers in the pretrained GloVe SWE reflect such problems. Given prompts about teenagers, 30% of outputs from GPT2-XL and 29% from LLaMA-2-7B GLMs discuss societal problems, most commonly violence, but also drug use, mental illness, and sexual taboo. Nepali models, while not free of such associations, are less dominated by social problems. Data from workshops with N=13 U.S. adolescents and N=18 Nepalese adolescents show that AI presentations are disconnected from teenage life, which revolves around activities like school and friendship. Participant ratings of how well 20 trait words describe teens are uncorrelated with SWE associations, with Pearson's r=.02 (n.s.) in English FastText and r=.06 (n.s.) in English GloVe, and r=.06 (n.s.) in Nepali FastText and r=-.23 (n.s.) in Nepali GloVe. U.S. participants suggested AI could fairly present teens by highlighting diversity, while Nepalese participants centered positivity. Participants were optimistic that, if it learned from adolescents rather than media sources, AI could help mitigate stereotypes. Our work offers an understanding of the ways SWEs and GLMs misrepresent a developmentally vulnerable group and provides a template for less sensationalized characterization.
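
To make the rating comparison concrete, the following is a minimal sketch of correlating human trait ratings with embedding associations. It assumes the association between a trait word and "teenager" is measured by cosine similarity, which the abstract does not specify. The vectors and ratings are invented toy values for illustration only, not data from the study, and the names `toy_vectors`, `toy_ratings`, and `association_scores` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def association_scores(target, traits, vectors):
    """Cosine similarity of each trait word to the target word (assumed measure)."""
    return [cosine(vectors[target], vectors[t]) for t in traits]


# Hypothetical 3-dimensional vectors; real SWEs such as GloVe or FastText
# have hundreds of dimensions and would be loaded from pretrained files.
toy_vectors = {
    "teenager": [0.9, 0.1, 0.3],
    "violent": [0.8, 0.2, 0.1],
    "curious": [0.1, 0.9, 0.4],
    "friendly": [0.2, 0.7, 0.6],
    "reckless": [0.7, 0.1, 0.5],
}

# Hypothetical mean participant ratings (1 = describes teens poorly, 5 = well).
toy_ratings = {"violent": 1.5, "curious": 4.2, "friendly": 4.5, "reckless": 2.8}

traits = list(toy_ratings)
scores = association_scores("teenager", traits, toy_vectors)
ratings = [toy_ratings[t] for t in traits]
r = pearson_r(ratings, scores)
print(f"Pearson's r between ratings and associations: {r:.2f}")
```

A value of r near zero, as reported above, would indicate that the words an embedding most associates with teenagers are not the words teenagers consider descriptive of themselves. A significance test, which the reported "n.s." results imply, is omitted from this sketch.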
