From ChatGPT to DeepSeek: Can LLMs Simulate Humanity?

Qian Wang, Zhenheng Tang, Bingsheng He

TL;DR

This paper critically evaluates whether LLMs can faithfully simulate human society, highlighting fundamental gaps such as the absence of intrinsic motivation, inner psychological states, and diverse personal histories. It contrasts traditional simulations with LLM-based approaches, arguing that while LLMs offer cost efficiency, scalability, and emergent behaviors, they are constrained by data biases and the lack of genuine incentives. The authors propose concrete strategies to align LLM simulations with real human dynamics, including data enrichment, improved agent design, environment realism, external knowledge injection, and robust evaluation metrics. Through a cryptocurrency trading case study (CryptoTrade), the paper illustrates both the promise and limitations of LLM-driven simulations, underscoring the need for hybrid methods and careful metric design to ensure realistic and ethical insights with practical impact.

Abstract

Simulation powered by Large Language Models (LLMs) has become a promising method for exploring complex human social behaviors. However, applying LLMs to simulation presents significant challenges, particularly in their capacity to replicate the complexities of human behavior and societal dynamics accurately; recent studies have highlighted discrepancies between simulated and real-world interactions. We rethink LLM-based simulations by emphasizing both their limitations and the requirements for advancing them. By critically examining these challenges, we aim to offer actionable insights and strategies for making LLM simulations more applicable to human society.

Paper Structure

This paper contains 22 sections, 4 figures, 1 table.

Figures (4)

  • Figure 4: Numerous biases in the training data.
  • Figure 5: Overview of the CryptoTrade Simulation.
  • Figure 6: Comparison of CryptoTrade with other trading baselines.
  • Figure 7: Reasoning process of GPT-3.5 and GPT-4.