Centaur: a foundation model of human cognition

Marcel Binz; Elif Akata; Matthias Bethge; Franziska Brändle; Fred Callaway; Julian Coda-Forno; Peter Dayan; Can Demircan; Maria K. Eckstein; Noémi Éltető; Thomas L. Griffiths; Susanne Haridi; Akshay K. Jagadish; Li Ji-An; Alexander Kipnis; Sreejan Kumar; Tobias Ludwig; Marvin Mathony; Marcelo Mattar; Alireza Modirshanechi; Surabhi S. Nath; Joshua C. Peterson; Milena Rmus; Evan M. Russek; Tankred Saanum; Johannes A. Schubert; Luca M. Schulze Buschoff; Nishad Singhi; Xin Sui; Mirko Thalmann; Fabian Theis; Vuong Truong; Vishaal Udandarao; Konstantinos Voudouris; Robert Wilson; Kristin Witte; Shuchen Wu; Dirk Wulff; Huadong Xiong; Eric Schulz

TL;DR

This paper introduces Centaur, a computational model that can predict and simulate human behavior in any experiment expressible in natural language. The model's internal representations become more aligned with human neural activity after finetuning.

Abstract

Establishing a unified theory of cognition has been a major goal of psychology. While there have been previous attempts to instantiate such theories by building computational models, we currently do not have one model that captures the human mind in its entirety. A first step in this direction is to create a model that can predict human behavior in a wide range of settings. Here we introduce Centaur, a computational model that can predict and simulate human behavior in any experiment expressible in natural language. We derived Centaur by finetuning a state-of-the-art language model on a novel, large-scale data set called Psych-101. Psych-101 reaches an unprecedented scale, covering trial-by-trial data from over 60,000 participants performing over 10,000,000 choices in 160 experiments. Centaur not only captures the behavior of held-out participants better than existing cognitive models, but also generalizes to new cover stories, structural task modifications, and entirely new domains. Furthermore, we find that the model's internal representations become more aligned with human neural activity after finetuning. Taken together, our results demonstrate that it is possible to discover computational models that capture human behavior across a wide range of domains. We believe that such models provide tremendous potential for guiding the development of cognitive theories and present a case study to demonstrate this.

Paper Structure

This paper contains 113 sections, 20 equations, 15 figures, 2 tables.

Figures (15)

Figure 1: Psych-101 and Centaur overview. a, Psych-101 comprises trial-by-trial data from 160 psychological experiments and 60,092 participants, who made 10,681,650 choices in total. It contains domains such as multi-armed bandits, decision-making, memory, supervised learning, Markov decision processes, and others (shown examples are stylized and abbreviated for readability). b, Centaur is a foundation model of human cognition that is obtained by adding low-rank adapters to a state-of-the-art language model and finetuning it on Psych-101.
Figure 2: Goodness-of-fit on Psych-101. a, Difference in log-likelihood of Centaur and Llama relative to a domain-specific cognitive model for each experiment. A value of zero corresponds to the goodness-of-fit of the domain-specific cognitive model while a value above zero indicates improved goodness-of-fit to human responses. Error bars correspond to the standard error of the mean, taken over responses. Centaur outperforms both Llama and a collection of domain-specific cognitive models in almost every experiment. Note that we only included experiments for which we have implemented a domain-specific cognitive model in this graphic and merged different studies using the same paradigm. A full table for all experiments can be found in the Supplementary Information. b, Model simulations on the horizon task. The plot visualizes probability densities over reward and an information bonus parameter for both people and simulated runs of Centaur. c, Model simulations on the two-step task. The plot visualizes probability densities over reward and a parameter indicating how model-based learning was for both people and simulated runs of Centaur. d, Model simulations on a social prediction game. The plot visualizes probability densities over accuracies of predicting human strategies and strategies of an artificial agent with matched statistics for both people and simulated runs of Centaur.
Figure 3: Evaluation in different held-out settings. a, Negative log-likelihoods for the two-step task with a modified cover story [feher2020humans]. b, Negative log-likelihoods for a three-armed bandit experiment [dubois2022value]. c, Negative log-likelihoods for an experiment probing logical reasoning [jansen2021rational] with items based on the Law School Admission Test (LSAT). Centaur outperforms both Llama and domain-specific cognitive models when faced with modified cover stories, problem structures, and entirely novel domains. Error bars correspond to the standard error of the mean, taken over responses.
Figure 4: Human alignment. a, Multidimensional scaling embedding of the ten behavioral metrics in CogBench [pmlr-v235-coda-forno24a] for different models. b, Pearson correlation coefficients indicating how well human neural activity in the two-step task [feher2023rethinking] can be decoded using Centaur's internal representations extracted from different layers. c, Pearson correlation coefficients indicating how well human neural activity in a sentence-reading task [tuckute2024driving] can be decoded using Centaur's internal representations extracted from different layers. Control refers to a model that uses representations extracted from a randomly initialized transformer model with matched architecture.
Figure 5: Model-guided scientific discovery. We used Psych-101 and Centaur to guide the development of a cognitive model for a multi-attribute decision-making study. Each panel shows the Akaike information criterion (AIC) for the set of models considered at the given stage, starting with the models considered in the original study. We first asked DeepSeek-R1 to generate an explanation for human responses and formalized the resulting verbal strategy into a formal computational model. We then further refined this model through scientific regret minimization using Centaur as a reference model. Eight data points are visualized for which Centaur makes accurate predictions but the DeepSeek-R1-discovered model does not. We then used this information to design a domain-specific cognitive model that is as predictive as Centaur yet still interpretable.
...and 10 more figures
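
The captions for Figures 2, 3, and 5 rely on two standard goodness-of-fit quantities: a per-response log-likelihood difference relative to a baseline model (with the standard error of the mean taken over responses) and the Akaike information criterion. The following is a minimal sketch of how such quantities are commonly computed, not the authors' evaluation code. The function names and the toy numbers are hypothetical, and computing the standard error over the per-response differences is an assumption; the captions only state that it is taken over responses.

```python
import math
from statistics import mean, stdev


def loglik_difference(model_ll, baseline_ll):
    """Mean per-response log-likelihood difference (model minus baseline)
    and its standard error of the mean.

    Both inputs are lists of per-response log-likelihoods, aligned by index.
    A positive mean indicates that the model fits the responses better than
    the baseline. Taking the SEM over the differences is an assumption.
    """
    if len(model_ll) != len(baseline_ll):
        raise ValueError("inputs must have the same number of responses")
    if len(model_ll) < 2:
        raise ValueError("need at least two responses to estimate the SEM")
    diffs = [m - b for m, b in zip(model_ll, baseline_ll)]
    sem = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs), sem


def aic(num_params, total_log_likelihood):
    """Akaike information criterion: 2k - 2 ln(L). Lower values are better."""
    return 2 * num_params - 2 * total_log_likelihood


if __name__ == "__main__":
    # Hypothetical probabilities assigned to four observed human choices.
    model_ll = [math.log(p) for p in (0.8, 0.6, 0.7, 0.9)]
    baseline_ll = [math.log(p) for p in (0.5, 0.5, 0.5, 0.5)]

    diff, sem = loglik_difference(model_ll, baseline_ll)
    print(f"log-likelihood difference: {diff:.3f} +/- {sem:.3f}")

    # Hypothetical baseline with 3 free parameters, fitted to the same choices.
    print(f"AIC of baseline: {aic(3, sum(baseline_ll)):.3f}")
```

Because the AIC penalizes each free parameter, a model with more parameters has to achieve a sufficiently higher log-likelihood to be preferred, which is why it is a natural criterion for comparing candidate cognitive models of different complexity.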