Can Machines Think Like Humans? A Behavioral Evaluation of LLM Agents in Dictator Games

Ji Ma

TL;DR

This study examines whether LLM agents exhibit human-like prosocial behavior in dictator games when their sense of self and theory of mind are manipulated through varied personas and experimental framings. Across ten open-source models and the frontier model GPT-4o, it runs behavioral experiments and regression analyses that compare the agents' decisions with human baselines. The results show substantial model- and prompt-dependent variability and no consistent mapping to human behavior. Simply assigning a human-like persona does not reliably induce human-like generosity, and giving among agents is largely bimodal rather than following the continuous distribution typical of humans. The work argues for a nuanced, interdisciplinary approach to evaluating prosocial AI, cautions against treating LLMs as proxies for human decision-makers, and highlights the need for dedicated prosocial AI research to guide responsible deployment in philanthropic and social contexts.

Abstract

As Large Language Model (LLM)-based agents increasingly engage with human society, how well do we understand their prosocial behaviors? We (1) investigate how LLM agents' prosocial behaviors can be induced by different personas and benchmarked against human behaviors; and (2) introduce a social science approach to evaluate LLM agents' decision-making. We explore how different personas and experimental framings affect these AI agents' altruistic behavior in dictator games and compare their behaviors within the same LLM family, across various families, and with human behaviors. The findings reveal that merely assigning a human-like identity to LLMs does not produce human-like behaviors. They also suggest that LLM agents' reasoning does not consistently exhibit textual markers of human decision-making in dictator games and that their alignment with human behavior varies substantially across model architectures and prompt formulations; worse, this variation follows no clear pattern. As society increasingly integrates machine intelligence, "Prosocial AI" emerges as a promising and urgent research direction in philanthropic studies.
