Do Large Language Models Solve the Problems of Agent-Based Modeling? A Critical Review of Generative Social Simulations

Maik Larooij, Petter Törnberg

TL;DR

This paper evaluates whether LLM-driven generative ABMs address long-standing criticisms of ABMs, especially those concerning validation and empirical grounding. Through a systematic review of 35 papers, the authors analyze target phenomena and validation strategies, finding heavy reliance on believability and sparse evidence of operational validity. They argue that the black-box nature and biases of LLMs can exacerbate interpretability and calibration challenges, hindering causal disentanglement and theoretical contribution. The paper concludes that, despite the excitement, generative ABMs will struggle to deliver rigorous, theory-driven insights until robust validation frameworks and standardization are established.

Abstract

Recent advancements in AI have reinvigorated Agent-Based Models (ABMs), as the integration of Large Language Models (LLMs) has led to the emergence of "generative ABMs" as a novel approach to simulating social systems. While ABMs offer a means to bridge micro-level interactions with macro-level patterns, they have long faced criticism from social scientists, who point to, for example, a lack of realism, computational complexity, and the challenges of calibrating and validating against empirical data. This paper reviews the generative ABM literature to assess whether this new approach adequately addresses these long-standing criticisms. We find that studies show limited awareness of these historical debates. Validation remains poorly addressed, with many studies relying solely on subjective assessments of model "believability", and even the most rigorous validation failing to adequately evidence operational validity. We argue that there are reasons to believe that LLMs will exacerbate rather than resolve the long-standing challenges of ABMs. Moreover, the black-box nature of LLMs limits their usefulness for disentangling complex emergent causal mechanisms. While generative ABMs are still in a stage of early experimentation, these findings raise the question of whether and how the field can transition to the type of rigorous modeling needed to contribute to social scientific theory.

Paper Structure

This paper contains 12 sections, 2 tables.