Large Language Models Assume People are More Rational than We Really are

Ryan Liu, Jiayi Geng, Joshua C. Peterson, Ilia Sucholutsky, Thomas L. Griffiths

TL;DR

The paper shows that state-of-the-art LLMs assume humans are more rational than they actually are, both in forward tasks (predicting choices) and in inverse tasks (inferring utilities). Evaluating zero-shot and chain-of-thought prompting across multiple models on large risky-choice and preference-inference datasets, the study finds a consistent bias toward rational models of choice: LLM outputs align closely with economic theories such as expected value and absolute/relative utility, yet diverge from real human behavior. The results matter both for aligning AI systems and for using LLMs to simulate humans. They suggest that modeling what people expect of others should be kept separate from modeling how people actually behave, and that the way prompts and training shape these implicit models deserves careful attention. The work advocates scenario-specific alignment strategies and outlines future research directions for capturing the nuances of human decision-making in AI systems.

Abstract

In order for AI systems to communicate effectively with people, they must understand how we make decisions. However, people's decisions are not always rational, so the implicit internal models of human decision-making in Large Language Models (LLMs) must account for this. Previous empirical evidence seems to suggest that these implicit models are accurate -- LLMs offer believable proxies of human behavior, acting how we expect humans would in everyday interactions. However, by comparing LLM behavior and predictions to a large dataset of human decisions, we find that this is actually not the case: when both simulating and predicting people's choices, a suite of cutting-edge LLMs (GPT-4o & 4-Turbo, Llama-3-8B & 70B, Claude 3 Opus) assume that people are more rational than we really are. Specifically, these models deviate from human behavior and align more closely with a classic model of rational choice -- expected value theory. Interestingly, people also tend to assume that other people are rational when interpreting their behavior. As a consequence, when we compare the inferences that LLMs and people draw from the decisions of others using another psychological dataset, we find that these inferences are highly correlated. Thus, the implicit decision-making models of LLMs appear to be aligned with the human expectation that other people will act rationally, rather than with how people actually act.
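
To make the rational baseline concrete, the following is a minimal sketch of how expected value theory picks between two gambles. The gamble values and the helper names (`expected_value`, `ev_choice`) are illustrative assumptions and are not taken from the paper or its datasets.

```python
from typing import List, Tuple

# A gamble is a list of (probability, payoff) pairs.
# This representation is an assumption made for illustration.
Gamble = List[Tuple[float, float]]


def expected_value(gamble: Gamble) -> float:
    """Return the probability-weighted sum of the gamble's payoffs."""
    total_p = sum(p for p, _ in gamble)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for p, x in gamble)


def ev_choice(a: Gamble, b: Gamble) -> str:
    """Pick the gamble an expected-value maximizer would choose."""
    ev_a, ev_b = expected_value(a), expected_value(b)
    if ev_a == ev_b:
        return "indifferent"
    return "A" if ev_a > ev_b else "B"


# Hypothetical example: a sure 30 versus a 50/50 chance of 100 or -20.
gamble_a: Gamble = [(1.0, 30.0)]
gamble_b: Gamble = [(0.5, 100.0), (0.5, -20.0)]

print(expected_value(gamble_a))       # 30.0
print(expected_value(gamble_b))       # 40.0
print(ev_choice(gamble_a, gamble_b))  # B
```

An expected-value maximizer always picks B here, whereas a person might well prefer the sure payoff in A. Gaps of this kind between the rational prediction and actual choices are what the abstract refers to.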

Paper Structure (37 sections, 4 equations, 7 figures, 18 tables)

Figures (7)

  • Figure 1: Two tasks we use to assess the implicit assumptions that LLMs make about human decision-making. (A) Predicting choices between gambles. Each gamble is described by the probabilities and values of different outcomes, and the goal is to predict what people will choose. (B) Inferring preferences from choices. Here, a person chooses one of many sets of objects and the goal is to infer their preferences based on that choice.
  • Figure 2: The correlations between LLMs [Llama3-8B, Llama3-70B, GPT-4-Turbo (0125-preview), GPT-4o]
  • Figure 3: Comparing GPT-4o CoT rankings (y-coordinates) to humans and four theoretical decision-making models (x-coordinates) in the positive setting.
  • Figure 4: Comparing GPT-4 Turbo (0125-preview) CoT rankings (y-coordinates) to humans and four theoretical decision-making models (x-coordinates) in the positive setting.
  • Figure 5: Comparing Claude 3 Opus CoT rankings (y-coordinates) to humans and four theoretical decision-making models (x-coordinates) in the positive setting.
  • ...and 2 more figures