AnthroScore: A Computational Linguistic Measure of Anthropomorphism
Myra Cheng, Kristina Gligoric, Tiziano Piccardi, Dan Jurafsky
TL;DR
AnthroScore measures implicit anthropomorphism in language about non-human technologies. It is a lexicon-free metric computed from masked-language-model predictions: $A(s_x) = \log \frac{P_{\textsc{human}}(s_x)}{P_{\textsc{non-human}}(s_x)}$, where the two probabilities are the model's estimates that the entity $x$ in sentence $s$ is framed as human or as non-human by its surrounding context; $\bar{A}$ denotes the corpus-level aggregate. The metric is validated against human judgments and LIWC dimensions, then applied to ~600k arXiv CS/Stat abstracts, ~55k ACL abstracts, and ~14k downstream news headlines. The analysis shows anthropomorphism increasing over time in research papers, especially in language-model and multimodal domains, and higher anthropomorphism in news headlines than in the papers they cite. These findings point to the risk of misleading metaphors in public discourse, and the paper offers domain-specific recommendations (e.g., verb choice, disclosure practices) to mitigate this framing. Because it needs no lexicon, AnthroScore scales to other text sources and disciplines.
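To make the log-ratio concrete, here is a minimal sketch of the score computation, not the authors' implementation. It assumes the masked language model has already produced a probability for each candidate token at the masked entity position, passed in as a plain dictionary. The pronoun lists and the function names `anthroscore` and `mean_anthroscore` are illustrative assumptions, and the corpus-level aggregate is taken here to be a simple mean.

```python
import math

# Assumption: human and non-human framing are approximated by summing
# the masked-LM probabilities of these pronouns. The lists are illustrative.
HUMAN_PRONOUNS = ("he", "she", "her", "him")
NONHUMAN_PRONOUNS = ("it", "its")


def anthroscore(mask_probs):
    """Return log(P_human / P_non-human) for one masked sentence.

    mask_probs maps candidate tokens to the probability a masked
    language model assigns them at the masked entity position.
    """
    p_human = sum(mask_probs.get(w, 0.0) for w in HUMAN_PRONOUNS)
    p_nonhuman = sum(mask_probs.get(w, 0.0) for w in NONHUMAN_PRONOUNS)
    if p_human <= 0.0 or p_nonhuman <= 0.0:
        raise ValueError("both probability masses must be positive")
    return math.log(p_human / p_nonhuman)


def mean_anthroscore(all_mask_probs):
    """Average the per-sentence scores (assumed corpus-level aggregate)."""
    scores = [anthroscore(p) for p in all_mask_probs]
    if not scores:
        raise ValueError("need at least one sentence")
    return sum(scores) / len(scores)


# Hand-written probabilities standing in for real model output.
human_leaning = {"he": 0.2, "she": 0.2, "it": 0.1}
balanced = {"he": 0.1, "it": 0.1}
print(anthroscore(human_leaning))                  # positive: human framing dominates
print(anthroscore(balanced))                       # zero: equal probability mass
print(mean_anthroscore([human_leaning, balanced]))
```

A positive score means the context makes human pronouns more likely than non-human ones at the entity's position, a negative score means the opposite, and zero means the two are equally likely.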
Abstract
Anthropomorphism, or the attribution of human-like characteristics to non-human entities, has shaped conversations about the impacts and possibilities of technology. We present AnthroScore, an automatic metric of implicit anthropomorphism in language. We use a masked language model to quantify how non-human entities are implicitly framed as human by the surrounding context. We show that AnthroScore corresponds with human judgments of anthropomorphism and dimensions of anthropomorphism described in social science literature. Motivated by concerns of misleading anthropomorphism in computer science discourse, we use AnthroScore to analyze 15 years of research papers and downstream news articles. In research papers, we find that anthropomorphism has steadily increased over time, and that papers related to language models have the most anthropomorphism. Within ACL papers, temporal increases in anthropomorphism are correlated with key neural advancements. Building upon concerns of scientific misinformation in mass media, we identify higher levels of anthropomorphism in news headlines compared to the research papers they cite. Since AnthroScore is lexicon-free, it can be directly applied to a wide range of text sources.
