Persona-Based Simulation of Human Opinion at Population Scale

Mao Li, Frederick G. Conrad

Abstract

What does it mean to model a person, not merely to predict isolated responses, preferences, or behaviors, but to simulate how an individual interprets events, forms opinions, makes judgments, and acts consistently across contexts? This question matters because social science requires not only observing and predicting human outcomes, but also simulating interventions and their consequences. Although large language models (LLMs) can generate human-like answers, most existing approaches remain predictive, relying on demographic correlations rather than representations of individuals themselves. We introduce SPIRIT (Semi-structured Persona Inference and Reasoning for Individualized Trajectories), a framework designed explicitly for simulation rather than prediction. SPIRIT infers psychologically grounded, semi-structured personas from public social media posts, integrating structured attributes (e.g., personality traits and world beliefs) with unstructured narrative text reflecting values and lived experience. These personas prompt LLM-based agents to act as specific individuals when answering survey questions or responding to events. Using the Ipsos KnowledgePanel, a nationally representative probability sample of U.S. adults, we show that SPIRIT-conditioned simulations recover self-reported responses more faithfully than demographic personas do and reproduce human-like heterogeneity in response patterns. We further demonstrate that persona banks can function as virtual respondent panels for studying both stable attitudes and time-sensitive public opinion.

Paper Structure

This paper contains 79 sections, 5 equations, 6 figures, 9 tables.

Figures (6)

  • Figure 1: Framework evaluation across models and conditioning strategies. A, Distribution of per-user position-weighted mean inferred values for the same eligible Ipsos KnowledgePanel participants linked to public social media accounts, comparing their self-reported responses with simulated responses generated under the demographic persona and the SPIRIT persona (with non-demographic attributes inferred from text). For each participant, responses are aggregated across survey items using a position-weighted mean, such that identical composite values arise only from identical response patterns. Human responses are shown as an empirical reference for the level of population-level heterogeneity expected when aggregating across many items. Demographic personas yield highly concentrated distributions, whereas SPIRIT personas preserve substantially greater individual-level variation, closely resembling human heterogeneity. B, User-level inference accuracy across models ordered by size. SPIRIT personas consistently outperform demographic personas, with performance gains saturating for larger models.
  • Figure 2: Persona-bank responses compared with polling benchmarks, grouped by question type. A, Long-term attitudinal questions (abortion and immigration) drawn from general opinion surveys. B, Event-sensitive questions (Epstein files and Venezuela policy attitudes) fielded in late 2025 to early 2026, for which simulated respondents may require contemporaneous context. Shaded regions indicate issue-specific clusters and are used to avoid implying continuity across unrelated issues. Within each issue-specific cluster, persona-bank estimates reproduce coherent question-to-question patterns that align with polling benchmarks. After calibration, Twitter-based estimates track benchmarks more closely in absolute level than Reddit-based estimates, which exhibit systematic shifts in magnitude. Both Twitter- and Reddit-based estimates preserve the same directional structure.
  • Figure 3: Overview of the SPIRIT framework. A probability-based sample from the Ipsos KnowledgePanel is linked to respondents’ social media accounts, and their historical posts are collected to infer structured user personas with a painter model. These inferred personas form a persona bank that serves as digital twins for Stage 2 reasoning, where a reasoner model simulates responses to downstream tasks such as survey items. The simulated responses are then weighted to produce U.S. population-level estimates.
  • Figure 4: Diagnostic analyses of response confidence and diversity for GPT-5-mini. A, Distribution of response-level confidence categories under demographic persona and SPIRIT persona conditioning. B, Distribution of response entropy per question, where lower entropy indicates more fixed or biased response tendencies. C, Relationship between user-level accuracy and the number of low-confidence responses per user, shown separately for each condition with linear trend lines.
  • Figure 5: Relationship between social media trace quality and prediction accuracy. A, Accuracy as a function of the log-transformed total number of characters across all observed posts for each individual. B, Accuracy as a function of the number of low-confidence persona attributes inferred by SPIRIT. Points represent individual users, colored by platform (Twitter vs. Reddit). Solid lines indicate linear trends.
  • ...and 1 more figure
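
The per-question response entropy mentioned in the Figure 4 caption can be illustrated with a minimal sketch. The captions do not state the exact formula, so this assumes the standard Shannon entropy (in bits) of the empirical answer distribution for a single question; the function name and the example answers are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def response_entropy(responses):
    """Shannon entropy (bits) of the empirical answer distribution for one question.

    Assumption: entropy is computed over the observed answer categories only;
    the paper's exact definition is not given in the captions.
    """
    counts = Counter(responses)
    total = sum(counts.values())
    # Sum of p * log2(1/p) over observed categories; zero when all answers agree.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical simulated answers to one survey item under two conditions.
concentrated = ["agree"] * 9 + ["disagree"]      # nearly fixed response tendency
spread = ["agree"] * 5 + ["disagree"] * 5        # evenly split responses

print(response_entropy(concentrated) < response_entropy(spread))
```

Under this reading, a question on which every simulated respondent gives the same answer has zero entropy, which matches the caption's interpretation of low entropy as a more fixed or biased response tendency.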