Reasoning aligns language models to human cognition
Gonçalo Guiomar, Elia Torre, Pehuen Moure, Victoria Shavina, Mario Giulianelli, Shih-Chii Liu, Valerio Mante
TL;DR
This work investigates how language models decide under uncertainty and what role chain-of-thought reasoning plays in aligning them with human cognition. By introducing an active probabilistic reasoning task that separates evidence sampling from inference, and by fitting a four-parameter mechanistic model, the authors compare humans and a broad suite of LLMs against near-optimal reference policies. Extended reasoning chiefly boosts inference by reducing biases and sharpening belief-to-choice mappings, while leaving a gap in active information acquisition. The fitted model places humans and LLMs in a shared low-dimensional cognitive space. The study provides a principled framework for evaluating alignment that links observable behavior to interpretable latent computations and charts directions for improving sampling efficiency in LLMs.
Abstract
Do language models make decisions under uncertainty like humans do, and what role does chain-of-thought (CoT) reasoning play in the underlying decision process? We introduce an active probabilistic reasoning task that cleanly separates sampling (actively acquiring evidence) from inference (integrating evidence toward a decision). Benchmarking humans and a broad set of contemporary large language models against near-optimal reference policies reveals a consistent pattern: extended reasoning is the key determinant of strong performance, driving large gains in inference and producing belief trajectories that become strikingly human-like, while yielding only modest improvements in active sampling. To explain these differences, we fit a mechanistic model that captures systematic deviations from optimal behavior via four interpretable latent variables: memory, strategy, choice bias, and occlusion awareness. This mechanistic model places humans and language models in a shared low-dimensional cognitive space, reproduces behavioral signatures across agents, and shows how chain-of-thought reasoning shifts language models toward human-like regimes of evidence accumulation and belief-to-choice mapping, tightening alignment in inference while leaving a persistent gap in information acquisition.
