Generative Agent Simulations of 1,000 People

Joon Sung Park, Carolyn Q. Zou, Aaron Shaw, Benjamin Mako Hill, Carrie Cai, Meredith Ringel Morris, Robb Willer, Percy Liang, Michael S. Bernstein

TL;DR

This work demonstrates that generative agents can faithfully simulate the attitudes and behaviors of over a thousand real individuals by grounding agents in two-hour AI-conducted interviews. By integrating interview transcripts with memory streams and expert reflections in a retrieval-augmented prompting framework, the authors achieve high fidelity to source participants across the General Social Survey core, Big Five personality traits, economic games, and replication experiments. Interview-informed agents outperform baselines and exhibit reduced demographic bias, validating a methodology for individual-level behavioral simulation and creating an accessible Agent Bank for future social science research. The study also lays out privacy-conscious access strategies and governance for sharing agent data, offering a scalable foundation for policy testing and theory development using AI-driven social simulations.

Abstract

The promise of human behavioral simulation--general-purpose computational agents that replicate human behavior across domains--could enable broad applications in policymaking and social science. We present a novel agent architecture that simulates the attitudes and behaviors of 1,052 real individuals--applying large language models to qualitative interviews about their lives, then measuring how well these agents replicate the attitudes and behaviors of the individuals that they represent. The generative agents replicate participants' responses on the General Social Survey 85% as accurately as participants replicate their own answers two weeks later, and perform comparably in predicting personality traits and outcomes in experimental replications. Our architecture reduces accuracy biases across racial and ideological groups compared to agents given demographic descriptions. This work provides a foundation for new tools that can help investigate individual and collective behavior.
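
The "85% as accurately" figure is a normalized accuracy: the agent's agreement with a participant's original answers, divided by the participant's own agreement with those answers on the two-week retake. The following is a minimal sketch of that ratio for a single participant, not the authors' code. The function names and the toy answers are hypothetical, and it assumes categorical answers compared by exact match.

```python
def agreement(a, b):
    """Fraction of items on which two equally long answer lists match."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("answer lists must be non-empty and equally long")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def normalized_accuracy(agent, original, retake):
    """Agent accuracy divided by the participant's own retake consistency."""
    agent_accuracy = agreement(agent, original)
    self_consistency = agreement(retake, original)
    if self_consistency == 0:
        # The ratio is undefined if the participant replicated none of their answers.
        raise ValueError("self-consistency is zero; normalized accuracy is undefined")
    return agent_accuracy / self_consistency


# Hypothetical toy data for one participant (not taken from the study).
original = ["agree", "disagree", "neutral", "agree", "agree"]
retake = ["agree", "disagree", "agree", "agree", "agree"]      # matches 4 of 5
agent = ["agree", "neutral", "neutral", "agree", "disagree"]   # matches 3 of 5

score = normalized_accuracy(agent, original, retake)
print(f"normalized accuracy: {score:.2f}")
```

A value near 1 would mean the agent predicts the participant about as well as the participant's own later answers do. How the per-participant ratios are aggregated into the reported number is not specified in this summary.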

Paper Structure

This paper contains 8 sections, 1 equation, 8 figures, 15 tables.

Figures (8)

  • Figure 1: The process of collecting participant data and creating generative agents begins by recruiting a stratified sample of 1,052 individuals from the U.S., selected based on age, census division, education, ethnicity, gender, income, neighborhood, political ideology, and sexual identity. Once recruited, participants complete a two-hour audio interview with our AI interviewer, followed by surveys and experiments. We create generative agents for each participant using their interview data. To evaluate these agents, both the generative agents and participants complete the same surveys and experiments. For the human participants, this involves retaking the surveys and experiments two weeks later. We assess the accuracy of the agents by comparing agent responses to the participants' original responses, normalizing by how consistently each participant replicates their own responses two weeks later.
  • Figure 2: Generative agents' predictive performance and 95% confidence intervals. The consistency rate between participants and the predictive performance of generative agents is evaluated across various constructs and averaged across individuals. For the General Social Survey (GSS), accuracy is reported due to its categorical response types, while the Big Five personality traits and economic games report mean absolute error (MAE) due to their numerical response types. Correlation is reported for all constructs. Normalized accuracy is provided for all metrics, except for MAE, which cannot be calculated for individuals whose MAE is 0 (i.e., those who responded the same way in both phases). We find that generative agents predict participants' behavior and attitudes well, especially when compared to participants' own rate of internal consistency. Additionally, using interviews to inform agent behavior significantly improves the predictive performance of agents for both GSS and Big Five constructs, outperforming other commonly used methods in the literature.
  • Figure 3: Demographic Parity Difference (DPD) for generative agents across political ideology, race, and gender subgroups on three tasks: GSS (in percentages), Big Five, and economic games (in correlation coefficients). DPD represents the performance disparity between the most and least favored groups within each demographic category. Generative agents using interviews consistently show lower DPDs compared to those using demographic information or persona descriptions, suggesting that interview-based generative agents mitigate bias more effectively across all tasks. Gender-based DPDs remain relatively low and consistent across all conditions.
  • Figure 4: The study platform and interface. Once recruited, our participants are routed to our custom-built platform. The interface includes several components: a) Participant sign-up page: Participants sign up with an ID and password of their choice. b) Avatar creator: Participants consent and create a 2-D sprite avatar to represent them in the study platform. c) Main interface displaying the study components: The modules include: 1) study consent, 2) avatar creation, 3) interview, 4) surveys and experiments, 5) self-consistency retake of the surveys and experiments. The modules only become available in order; the button to start a module becomes clickable once the participants have completed all previous modules. The self-consistency survey and experiment module only becomes available two weeks after the participants have completed the previous modules.
  • Figure 5: The architecture of the interviewer agent. It takes as input the participants' utterances and the interview script, generating the next action in the form of follow-up questions or deciding to move on to the next question module using a language model. A reflection module helps the architecture succinctly summarize and infer insights from the ongoing interview, enabling the agent to more effectively generate follow-up questions.
  • ...and 3 more figures
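
The Demographic Parity Difference in the Figure 3 caption is described as the performance gap between the most and least favored groups within a demographic category. A minimal sketch of that definition follows, assuming each participant has one performance score and one group label, and that a group's performance is the mean of its members' scores. That aggregation is an assumption made here for illustration; the function name and the toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def demographic_parity_difference(scores, groups):
    """Gap between the best- and worst-served group's mean score."""
    if len(scores) != len(groups) or len(scores) == 0:
        raise ValueError("scores and groups must be non-empty and equally long")
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    # Assumption: group performance is the mean of per-participant scores.
    group_means = {group: mean(values) for group, values in by_group.items()}
    return max(group_means.values()) - min(group_means.values())


# Hypothetical per-participant scores and group labels (not study data).
scores = [0.9, 0.8, 0.7, 0.6]
groups = ["group_a", "group_a", "group_b", "group_b"]

dpd = demographic_parity_difference(scores, groups)
print(f"DPD: {dpd:.2f}")
```

A DPD of 0 would mean every group in the category is served equally well; larger values indicate a wider gap between the best- and worst-served groups.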