Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility
Eun Cheol Choi, Lindsay E. Young, Emilio Ferrara
TL;DR
The paper investigates whether large language models (LLMs) can faithfully simulate human misinformation susceptibility by conditioning synthetic respondents on real survey profiles across three domains. It employs a multi-domain dataset and a rigorous analytic pipeline, including distributional comparisons via $JSD$ and $EMD$, correlation analyses with ground truth using Spearman’s $\rho$, and Elastic Net modeling to compare predictor effects. The findings indicate that while LLMs capture broad distributional tendencies and correlate moderately with human responses, they exaggerate the link between belief and sharing, inflate $R^2$-like predictive power, and overemphasize attitudinal/behavioral predictors at the expense of network features. Explorations of model reasoning traces and training data with CoT prompts and OLMoTrace suggest these distortions arise from biases embedded in training data and reasoning processes. Consequently, LLM-based survey simulations are better suited for diagnosing systematic divergences from human judgment than substituting for human data, particularly when social networks and relational structure are central to the research question.
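As a rough illustration of the distributional comparison described above, the sketch below computes the Jensen-Shannon divergence (JSD) and the earth mover's distance (EMD) between a human and a simulated response distribution. It is not the authors' pipeline: the 5-point scale, the base-2 logarithm, the unit spacing between scale points, the helper names, and the toy responses are all assumptions made for illustration.

```python
import math
from collections import Counter


def likert_distribution(responses, scale=(1, 2, 3, 4, 5)):
    """Return the relative frequency of each scale point, in scale order."""
    counts = Counter(responses)
    n = len(responses)
    return [counts.get(point, 0) / n for point in scale]


def jsd(p, q):
    """Jensen-Shannon divergence with base-2 logs (assumed), bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the KL divergence.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def emd(p, q):
    """1-D earth mover's distance, assuming unit spacing between scale points.

    For ordered categories this equals the summed absolute difference
    between the two cumulative distributions.
    """
    total, cumulative_gap = 0.0, 0.0
    for pi, qi in zip(p, q):
        cumulative_gap += pi - qi
        total += abs(cumulative_gap)
    return total


# Hypothetical toy data: belief ratings on an assumed 5-point scale.
human = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
simulated = [2, 2, 3, 3, 3, 3, 4, 4, 4, 5]

p_human = likert_distribution(human)
p_sim = likert_distribution(simulated)

print(f"JSD = {jsd(p_human, p_sim):.4f}")
print(f"EMD = {emd(p_human, p_sim):.4f}")
```

Both measures are zero for identical distributions. JSD ignores the ordering of response options, whereas EMD penalizes mass that has to move further along the scale, which is why the two are often reported together for ordinal survey items.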
Abstract
Large language models (LLMs) are increasingly used as proxies for human judgment in computational social science, yet their ability to reproduce patterns of susceptibility to misinformation remains unclear. We test whether LLM-simulated survey respondents, prompted with participant profiles drawn from social survey data measuring network, demographic, attitudinal and behavioral features, can reproduce human patterns of misinformation belief and sharing. Using three online surveys as baselines, we evaluate whether LLM outputs match observed response distributions and recover feature-outcome associations present in the original survey data. LLM-generated responses capture broad distributional tendencies and show modest correlation with human responses, but consistently overstate the association between belief and sharing. Linear models fit to simulated responses exhibit substantially higher explained variance and place disproportionate weight on attitudinal and behavioral features, while largely ignoring personal network characteristics, relative to models fit to human responses. Analyses of model-generated reasoning and LLM training data suggest that these distortions reflect systematic biases in how misinformation-related concepts are represented. Our findings suggest that LLM-based survey simulations are better suited for diagnosing systematic divergences from human judgment than for substituting it.
