The Homogenizing Effect of Large Language Models on Human Expression and Thought
Zhivar Sourati, Alireza S. Ziabari, Morteza Dehghani
TL;DR
This article examines how large language models (LLMs) risk homogenizing human expression and thought across language, perspective, and reasoning. Synthesizing evidence from linguistics, cognitive science, and computer science, it argues that next-token prediction on corpora skewed toward dominant voices, combined with reinforcement learning from human feedback (RLHF) and broad deployment, can dampen linguistic variety, distort perspectives, and narrow reasoning styles. It outlines the mechanisms involved (recursive user-model feedback, echo-chamber dynamics, and model-driven framing) and documents empirical patterns of reduced stylistic diversity, biased representation of perspectives, and homogenized reasoning. The authors call for pluralistic alignment and diversity-preserving design to safeguard meaningful human diversity in an era of pervasive AI-assisted communication and cognition.
Abstract
Cognitive diversity, reflected in variations of language, perspective, and reasoning, is essential to creativity and collective intelligence. This diversity is rich and grounded in culture, history, and individual experience. Yet as large language models (LLMs) become deeply embedded in people's lives, they risk standardizing language and reasoning. This Review synthesizes evidence across linguistics, cognitive science, and computer science to show how LLMs reflect and reinforce dominant styles while marginalizing alternative voices and reasoning strategies. We examine how their design and widespread use contribute to this effect by mirroring patterns in their training data and amplifying convergence as people increasingly rely on the same models across contexts. Unchecked, this homogenization risks flattening the cognitive landscapes that drive collective intelligence and adaptability.
