Thinking beyond the anthropomorphic paradigm benefits LLM research
Lujain Ibrahim, Myra Cheng
TL;DR
The paper analyzes the prevalence of anthropomorphic language in LLM research and argues that such framing can both facilitate understanding and constrain discovery. A large-scale, multi-dataset analysis shows that anthropomorphic terminology is becoming more common, and a framework of five assumptions spans training, alignment, evaluation, interpretation of behavior, and user interaction. For each stage, the authors offer non-anthropomorphic alternatives (e.g., byte-level tokenization, latent-space reasoning, normative specifications, dynamic evaluation, and role-play-informed interpretations) and discuss how these can broaden research directions. The paper closes with recommendations to develop new metaphors, extend critical analysis beyond terminology, and incorporate cross-disciplinary perspectives, with the aim of enabling new capabilities and safer, more robust LLM development. The authors also acknowledge that anthropomorphism can be natural and useful, and they advocate awareness rather than elimination so that intuition is balanced with technical precision.
Abstract
Anthropomorphism, or the attribution of human traits to technology, is an automatic and unconscious response that occurs even in those with advanced technical expertise. In this position paper, we analyze hundreds of thousands of research articles to present empirical evidence of the prevalence and growth of anthropomorphic terminology in research on large language models (LLMs). We argue for challenging the deeper assumptions reflected in this terminology -- which, though often useful, may inadvertently constrain LLM development -- and broadening beyond them to open new pathways for understanding and improving LLMs. Specifically, we identify and examine five anthropomorphic assumptions that shape research across the LLM development lifecycle. For each assumption (e.g., that LLMs must use natural language for reasoning, or that they should be evaluated on benchmarks originally meant for humans), we demonstrate empirical, non-anthropomorphic alternatives that remain under-explored yet offer promising directions for LLM research and development.
