LLM-based Human Simulations Have Not Yet Been Reliable

Qian Wang; Jiaying Wu; Zichen Jiang; Zhenheng Tang; Bingqiao Luo; Nuo Chen; Wei Chen; Bingsheng He

TL;DR

The paper argues that current LLM-based human simulations do not reliably represent real human behavior, owing to intrinsic model biases and simulation design flaws. It formalizes a general framework for such simulations, analyzes social, economic, policy, and psychological domains, and identifies core weaknesses in cognition, memory, and validation. It then proposes a systematic solution framework that emphasizes enriched data foundations, improved LLM capabilities, and rigorous multi-level validation to enhance fidelity and trustworthiness, together with an algorithm that operationalizes the framework. The work highlights practical implications for research and applications and offers a pathway toward more credible, human-aligned simulations with robust verification. Overall, it calls for a shift from ad-hoc performance toward verifiable reliability in LLM-driven human simulations.

Abstract

Large Language Models (LLMs) are increasingly employed for simulating human behaviors across diverse domains. However, our position is that current LLM-based human simulations remain insufficiently reliable, as evidenced by significant discrepancies between their outcomes and authentic human actions. Our investigation begins with a systematic review of LLM-based human simulations in social, economic, policy, and psychological contexts, identifying their common frameworks, recent advances, and persistent limitations. This review reveals that such discrepancies primarily stem from inherent limitations of LLMs and flaws in simulation design, both of which are examined in detail. Building on these insights, we propose a systematic solution framework that emphasizes enriching data foundations, advancing LLM capabilities, and ensuring robust simulation design to enhance reliability. Finally, we introduce a structured algorithm that operationalizes the proposed framework, aiming to guide credible and human-aligned LLM-based simulations. To facilitate further research, we provide a curated list of related literature and resources at https://github.com/Persdre/awesome-llm-human-simulation.

Paper Structure

This paper contains 21 sections, 2 equations, 2 figures, and 2 tables.

Introduction
Current LLM-based Human Simulations
LLM-based Human Simulation Formulation
Social Simulation
Economic Simulation
Policy Simulation
Psychological Simulation
LLM Inherent Drawbacks
Bias
Mismatches in Simulating Cognition and Behavior
Simulation Design Drawbacks
Framework Design Drawbacks
Validation and Monitoring Drawbacks
Proposed Solutions
Enriching Data Foundation for Simulations
...and 6 more sections

Figures (2)

Figure 1: Flow of This Position Paper. We start by reviewing the current LLM-based human simulations, and then identify the causes of the gaps between simulation outputs and real-world human behavior. Finally, we propose targeted solutions for advancing the reliability of LLM-based human simulation.
Figure 2: Overview of the Proposed Solution Framework. It details three core components: (a) Enriched Data Foundations, (b) Improved LLM Capabilities, and (c) Trustworthy Simulation Design through Robust Validation.
