Predicting and Understanding Human Action Decisions: Insights from Large Language Models and Cognitive Instance-Based Learning

Thuy Ngoc Nguyen, Kasturi Jamale, Cleotilde Gonzalez

TL;DR

This work investigates whether open-source LLMs and a cognitive IBL model can predict and explain human action decisions in two sequential gridworld decision tasks requiring exploitation–exploration trade-offs and delayed feedback. By comparing Mistral-7B and Llama-3 70B against an ACT-R–based IBL model across full and restricted information conditions, the study shows that Mistral-7B best predicts human strategies and quickly incorporates new demonstrations, while IBL captures initial exploratory behavior and loss aversion under limited information. The results suggest a complementary dynamic: LLMs excel with abundant demonstrations and context, whereas cognitive models better reflect early human exploration and loss aversion. The findings motivate integrating LLMs with cognitive architectures to improve the modeling and understanding of complex human decision-making in AI-assisted systems. Practically, this points toward human-AI decision support that combines rapid learning from LLMs with cognitively grounded explanations from IBL models.

Abstract

Large Language Models (LLMs) have demonstrated their capabilities across various tasks, from language translation to complex reasoning. Understanding and predicting human behavior and biases are crucial for artificial intelligence (AI) assisted systems to provide useful assistance, yet it remains an open question whether these models can achieve this. This paper addresses this gap by leveraging the reasoning and generative capabilities of LLMs to predict human behavior in two sequential decision-making tasks. These tasks involve balancing exploitative and exploratory actions and handling delayed feedback, both essential for simulating real-life decision processes. We compare the performance of LLMs with a cognitive instance-based learning (IBL) model, which imitates human experiential decision-making. Our findings indicate that LLMs excel at rapidly incorporating feedback to enhance prediction accuracy. In contrast, the cognitive IBL model better accounts for human exploratory behaviors and effectively captures loss aversion bias, i.e., the tendency to choose a sub-optimal goal with fewer step-cost penalties rather than exploring to find the optimal choice, even with limited experience. The results highlight the benefits of integrating LLMs with cognitive architectures, suggesting that this synergy could enhance the modeling and understanding of complex human decision-making patterns.

Paper Structure (28 sections, 3 equations, 5 figures, 2 tables, 1 algorithm)

Figures (5)

  • Figure 1: An overview of the experiment design.
  • Figure 2: Example grids for simple and complex conditions: (a) "green" is the highest value target with "orange" as the distractor ($\Delta_d = 1$); (b) "orange" is the target, "blue" is the distractor ($\Delta_d = 4$).
  • Figure 3: Average KL divergence per episode for all models in both conditions of Experiments 1 and 2, with shaded areas indicating standard error at 95% confidence intervals. Lower KL divergence suggests better alignment.
  • Figure 4: Average prediction accuracy per episode for all models in both conditions of Experiments 1 and 2. Shaded areas indicate the standard error. Higher prediction accuracy suggests better alignment with human target consumption.
  • Figure 5: Average entropy difference for all models in both conditions of Experiments 1 and 2. Positive values indicate more exploration than humans, negative values indicate less, and near-zero values show alignment with human behavior.
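
The alignment measures named in the captions of Figures 3 and 5 can be illustrated with a minimal sketch. The exact definitions used in the paper are not given in this excerpt, so everything below is an assumption: the distributions are taken over which target is consumed, KL divergence is computed as KL(human || model), entropy difference is taken as model entropy minus human entropy (consistent with "positive values indicate more exploration than humans"), and the function names and the smoothing constant `eps` are hypothetical.

```python
import math


def normalize(counts, eps=1e-9):
    """Turn raw counts into a probability distribution.

    A small eps is added to every count (an assumption of this sketch)
    so that no probability is exactly zero when taking logarithms.
    """
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def entropy_difference(model, human):
    """Assumed sign convention: positive means the model's choices are
    more spread out (more exploratory) than the human choices."""
    return entropy(model) - entropy(human)


# Hypothetical counts of how often each of four targets was consumed.
human = normalize([30, 8, 1, 1])
model = normalize([12, 10, 9, 9])

print("KL(human || model):", kl_divergence(human, model))
print("Entropy difference:", entropy_difference(model, human))
```

Under these assumptions, a KL divergence near zero and an entropy difference near zero would both indicate that the model's distribution over targets is close to the human one.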