(Ir)rationality and Cognitive Biases in Large Language Models

Olivia Macmillan-Scott, Mirco Musolesi

TL;DR

The study evaluates whether large language models display rational reasoning or human-like cognitive biases by applying classic cognitive psychology tasks to seven LLMs in a zero-shot, repeated-prompt setting. It reports substantial inconsistency and a predominance of non-human-like errors across models, with GPT-4 leading on correct-and-logical performance. The authors propose a methodological framework for benchmarking rationality in LLMs and discuss implications for safe deployment and future research. Overall, the work highlights that LLM irrationality differs from human biases and motivates systematic, task-based evaluation of AI reasoning beyond surface accuracy.

Abstract

Do large language models (LLMs) display rational reasoning? LLMs have been shown to contain human biases due to the data they have been trained on; whether this is reflected in rational reasoning remains less clear. In this paper, we answer this question by evaluating seven language models using tasks from the cognitive psychology literature. We find that, like humans, LLMs display irrationality in these tasks. However, the way this irrationality is displayed does not reflect that shown by humans. When incorrect answers are given by LLMs to these tasks, they are often incorrect in ways that differ from human-like biases. On top of this, the LLMs reveal an additional layer of irrationality in the significant inconsistency of the responses. Aside from the experimental results, this paper seeks to make a methodological contribution by showing how we can assess and compare different capabilities of these types of models, in this case with respect to rational reasoning.

Paper Structure (12 sections, 9 figures, 4 tables)

Figures (9)

  • Figure 1: Default system prompt for Llama 2 7b and 13b.
  • Figure 2: Example response to the Monty Hall problem by Llama 2 7b (emphasis added).
  • Figure 3: Aggregated results across all tasks for each model. The LLMs were prompted with twelve tasks from cognitive psychology, and their responses were categorised over two dimensions: correct and human-like (in this graph, responses categorised as incorrect and non-human-like are distinguished from those that were incorrect but displayed correct reasoning). For each task, the LLMs were prompted ten times.
  • Figure 4: Example response to the Wason task (facilitated) by Bard (emphasis added).
  • Figure 5: Proportion of correct vs human-like responses across all tasks for each language model. The graph also depicts the proportion of responses that did not contain an answer or where the model refused to provide one. Correct responses include those with correct (logical) reasoning, as well as those with incorrect (illogical) reasoning that reached the correct answer. Human-like responses include those that are correct with logical reasoning, and those that are incorrect but are reached through a studied human cognitive bias.
  • ...and 4 more figures
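
The Figure 5 caption defines two overlapping groupings of responses: "correct" and "human-like". As a rough illustration of how such proportions could be tallied from hand-labelled responses, here is a minimal Python sketch. The category label strings and the `response_proportions` function are hypothetical names chosen for this example; they are not taken from the paper, and the authors' actual labelling and aggregation procedure may differ.

```python
from collections import Counter

# Hypothetical label names (assumptions for this sketch, not the paper's).
# Per the Figure 5 caption, "correct" covers a correct answer reached with
# either logical or illogical reasoning.
CORRECT = {"correct_logical", "correct_illogical"}
# "Human-like" covers correct answers with logical reasoning, plus incorrect
# answers that follow a studied human cognitive bias.
HUMAN_LIKE = {"correct_logical", "incorrect_human_bias"}
# Responses with no answer or a refusal are tracked separately.
NO_ANSWER = {"no_answer"}


def response_proportions(labels):
    """Return the share of responses in each grouping.

    The groupings overlap ("correct_logical" counts towards both "correct"
    and "human_like"), so the shares need not sum to 1.
    """
    if not labels:
        raise ValueError("labels must contain at least one response")
    counts = Counter(labels)
    total = len(labels)

    def share(group):
        return sum(counts[label] for label in group) / total

    return {
        "correct": share(CORRECT),
        "human_like": share(HUMAN_LIKE),
        "no_answer": share(NO_ANSWER),
    }


# Ten made-up labels for one model on one task (illustrative data only,
# mirroring the ten prompts per task mentioned in the Figure 3 caption).
example_labels = (
    ["correct_logical"] * 4
    + ["correct_illogical"] * 1
    + ["incorrect_human_bias"] * 2
    + ["incorrect_other"] * 2
    + ["no_answer"] * 1
)
example_result = response_proportions(example_labels)
```

Because a response that is correct with logical reasoning belongs to both groupings, the two proportions are best read side by side rather than as parts of a whole.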