Belief Attribution as Mental Explanation: The Role of Accuracy, Informativity, and Causality
Lance Ying, Almog Hillel, Ryan Truong, Vikash K. Mansinghka, Joshua B. Tenenbaum, Tan Zhi-Xuan
TL;DR
Which beliefs do people choose to attribute to others when explaining their behavior? The authors propose a language-augmented Bayesian theory-of-mind model (LaBToM) that quantifies the explanatory strength of a belief statement along three factors: accuracy, informativity, and causal relevance, each derived from a probabilistic model of belief-driven action. They evaluate the model in a gridworld task in which participants rank belief statements about an agent searching for keys. Causal relevance is the single factor that best predicts human attributions, while accuracy and informativity predict them reasonably well when combined. The work bridges belief inference and explanation generation, offering a computational account of how people select mental explanations and highlighting the role of causality in everyday theory-of-mind. This approach could inform the design of AI systems that attribute or communicate beliefs in human-centric ways.
Abstract
A key feature of human theory-of-mind is the ability to attribute beliefs to other agents as mentalistic explanations for their behavior. But given the wide variety of beliefs that agents may hold about the world and the rich language we can use to express them, which specific beliefs are people inclined to attribute to others? In this paper, we investigate the hypothesis that people prefer to attribute beliefs that are good explanations for the behavior they observe. We develop a computational model that quantifies the explanatory strength of a (natural language) statement about an agent's beliefs via three factors: accuracy, informativity, and causal relevance to actions, each of which can be computed from a probabilistic generative model of belief-driven behavior. Using this model, we study the role of each factor in how people selectively attribute beliefs to other agents. We investigate this via an experiment where participants watch an agent collect keys hidden in boxes in order to reach a goal, then rank a set of statements describing the agent's beliefs about the boxes' contents. We find that accuracy and informativity perform reasonably well at predicting these rankings when combined, but that causal relevance is the single factor that best explains participants' responses.
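To make the three factors concrete, the following is a minimal Python sketch on a toy problem. It is not the authors' model: the three-box setup, the likelihood values, and the specific formulas (accuracy as posterior probability of the statement, informativity as prior surprisal, causal relevance as the change in action probability when the statement holds versus when it does not) are all illustrative assumptions, and the paper's own definitions may differ. A belief statement is represented as the set of belief states in which it is true.

```python
import math

# Hypothetical toy setup (not from the paper): the agent believes the key is in
# exactly one of three boxes, and we observe it walking toward box A.
BELIEFS = ["A", "B", "C"]
PRIOR = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
# Assumed probability of the observed action under each belief state.
LIKELIHOOD = {"A": 0.9, "B": 0.1, "C": 0.1}


def posterior():
    """Bayes' rule: P(belief | action) is proportional to prior * likelihood."""
    joint = {b: PRIOR[b] * LIKELIHOOD[b] for b in BELIEFS}
    evidence = sum(joint.values())
    return {b: joint[b] / evidence for b in BELIEFS}


def accuracy(statement):
    """Assumed definition: posterior probability that the statement is true."""
    post = posterior()
    return sum(post[b] for b in statement)


def informativity(statement):
    """Assumed definition: surprisal (in bits) of the statement under the prior."""
    return -math.log2(sum(PRIOR[b] for b in statement))


def action_prob(states):
    """P(action | agent's belief lies in `states`), or 0.0 if `states` has no prior mass."""
    mass = sum(PRIOR[b] for b in states)
    if mass == 0:
        return 0.0
    return sum(PRIOR[b] * LIKELIHOOD[b] for b in states) / mass


def causal_relevance(statement):
    """Assumed definition: how much the statement being true raises the action's probability."""
    rest = [b for b in BELIEFS if b not in statement]
    return action_prob(statement) - action_prob(rest)


if __name__ == "__main__":
    statements = {
        "believes the key is in A": {"A"},
        "believes the key is in A or B": {"A", "B"},
    }
    for text, states in statements.items():
        print(text, accuracy(states), informativity(states), causal_relevance(states))
```

In this toy case the weaker statement ("A or B") is more accurate but less informative and less causally relevant than the specific one ("A"), which illustrates why the factors can pull in different directions when ranking candidate explanations.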
