AnthroScore: A Computational Linguistic Measure of Anthropomorphism

Myra Cheng; Kristina Gligoric; Tiziano Piccardi; Dan Jurafsky

TL;DR

AnthroScore is a lexicon-free metric of implicit anthropomorphism in language about non-human entities, computed from masked-language-model predictions: $A(s_x) = \log \frac{P_{\text{human}}(s_x)}{P_{\text{non-human}}(s_x)}$, with corpus-level aggregation $\bar{A}$. The metric is validated against human judgments and LIWC dimensions, then applied to ~600k arXiv CS/Stat abstracts, ~55k ACL abstracts, and ~14k downstream news headlines. Anthropomorphism in research papers has increased over time, especially in language-model and multimodal domains, and news headlines anthropomorphize more than the papers they cite. The findings highlight the risk of misleading metaphors in public discourse and motivate domain-specific recommendations (e.g., verb choice, disclosure practices) to mitigate this framing. Because it needs no lexicon, AnthroScore scales to a wide range of texts and disciplines.

Abstract

Anthropomorphism, or the attribution of human-like characteristics to non-human entities, has shaped conversations about the impacts and possibilities of technology. We present AnthroScore, an automatic metric of implicit anthropomorphism in language. We use a masked language model to quantify how non-human entities are implicitly framed as human by the surrounding context. We show that AnthroScore corresponds with human judgments of anthropomorphism and dimensions of anthropomorphism described in social science literature. Motivated by concerns of misleading anthropomorphism in computer science discourse, we use AnthroScore to analyze 15 years of research papers and downstream news articles. In research papers, we find that anthropomorphism has steadily increased over time, and that papers related to language models have the most anthropomorphism. Within ACL papers, temporal increases in anthropomorphism are correlated with key neural advancements. Building upon concerns of scientific misinformation in mass media, we identify higher levels of anthropomorphism in news headlines compared to the research papers they cite. Since AnthroScore is lexicon-free, it can be directly applied to a wide range of text sources.
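As a rough illustration of the score defined in the TL;DR, the sketch below computes $A(s_x)$ from the probabilities a masked language model assigns at the position of a masked entity, and averages the scores to get a corpus-level $\bar{A}$. Several things here are assumptions and not taken from the text above: the pronoun sets `HUMAN_PRONOUNS` and `NONHUMAN_PRONOUNS`, the idea of obtaining $P_{\text{human}}$ and $P_{\text{non-human}}$ by summing pronoun probabilities, the plain mean used for $\bar{A}$, and the function names. The probabilities are passed in as a dictionary so the example runs without any model; in practice they would come from a masked language model's output distribution.

```python
import math
from statistics import mean

# ASSUMPTION: illustrative pronoun sets; the text above does not list them.
HUMAN_PRONOUNS = {"he", "she", "him", "her", "his", "hers"}
NONHUMAN_PRONOUNS = {"it", "its"}


def anthroscore(mask_probs):
    """Return A(s_x) = log(P_human / P_non-human) for one masked sentence.

    mask_probs maps candidate tokens to the probability assigned to them
    at the masked entity position (hypothetical input format).
    """
    p_human = sum(p for tok, p in mask_probs.items()
                  if tok.lower() in HUMAN_PRONOUNS)
    p_nonhuman = sum(p for tok, p in mask_probs.items()
                     if tok.lower() in NONHUMAN_PRONOUNS)
    if p_human <= 0 or p_nonhuman <= 0:
        raise ValueError("both probability masses must be positive")
    return math.log(p_human / p_nonhuman)


def corpus_anthroscore(all_mask_probs):
    """Corpus-level score: the mean of per-sentence scores (assumed aggregation)."""
    return mean(anthroscore(p) for p in all_mask_probs)


# Made-up probabilities for two masked sentences.
human_like = {"he": 0.3, "she": 0.3, "it": 0.2}   # framed more as human
object_like = {"he": 0.05, "it": 0.6, "its": 0.1}  # framed more as non-human
scores = [anthroscore(human_like), anthroscore(object_like)]
average = corpus_anthroscore([human_like, object_like])
```

Under this sketch, a positive score means the context makes a human pronoun more likely than a non-human one at the entity's position, a negative score means the opposite, and zero means the two are equally likely.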

Paper Structure

This paper contains 45 sections, 3 equations, 11 figures, 2 tables.

Introduction
Background: Anthropomorphism
Harms of Anthropomorphizing Technology
Benefits of Anthropomorphism
Metaphors are powerful.
Methods
Measuring Anthropomorphism
Interpretation
Implementation Details
Datasets
Construct Validity and Robustness
Qualitative Analyses
Correlation with Human Perception
Correlation with LIWC
Robustness
...and 30 more sections

Figures (11)

Figure 1: To measure anthropomorphism in text, AnthroScore relies on probabilities computed using a masked language model to compare how much an entity is implicitly framed as human versus non-human.
Figure 2: Anthropomorphism is most prevalent in paper abstracts about computational linguistics and language models. Top left: Among the top 10 categories in CS/Stat arXiv, Computation and Language (cs.CL) has the highest average AnthroScore ($\bar{A}$). Top middle: LM-related papers have higher $\bar{A}$ than papers that do not mention LMs. Top right: Within LM papers, LMs are much more anthropomorphized than other technical artifacts, but do not score as high as human entities do. Bottom: $\bar{A}$ for the 50 most popular categories in CS/Stat arXiv. Some categories fall outside CS/Stat because many papers are cross-listed with other fields. Error bars indicate 95% CI.
Figure 3: Anthropomorphism is increasing over time. In arXiv and ACL (orange and purple respectively), average AnthroScore ($\bar{A}$) has increased in the past 15 years. In ACL papers, trends correspond with key advancements in neural models (annotated). Error bars indicate 95% CI. Straight line is least-squares linear fit.
Figure 4: News headlines anthropomorphize more than paper abstracts. Anthropomorphism is more prevalent in news headlines than in research abstracts overall and for all of the top 10 arXiv categories, as well as in LM-related papers. Error bars indicate 95% CI.
Figure A1: Screenshot of interface for human annotators.
...and 6 more figures

Theorems & Definitions (1)

Definition 1
