Understanding Epistemic Language with a Language-augmented Bayesian Theory of Mind
Lance Ying, Tan Zhi-Xuan, Lionel Wong, Vikash Mansinghka, Joshua B. Tenenbaum
TL;DR
LaBToM is a two-module framework that grounds the interpretation of epistemic language in a Bayesian theory of mind (BToM) and an epistemic language of thought (ELoT). Natural-language statements are translated into ELoT via grammar-constrained decoding and then evaluated against probabilistic inferences $\mathbf{Pr}(A, \phi)$ about agents' beliefs and goals, computed by the BToM module. The approach is validated on Doors, Keys, & Gems puzzles, where it correlates strongly with human judgments ($r$ roughly $0.76$–$0.81$) and outperforms multimodal LLM baselines; ELoT translation and online inference together allow nuanced handling of modal, uncertain, and knowledge claims. The results highlight the importance of language-aligned epistemic representations and coherent theory-of-mind reasoning for grounded interpretation of epistemic language, with implications for both cognitive modeling and AI systems that reason about others' minds.
Abstract
How do people understand and evaluate claims about others' beliefs, even though these beliefs cannot be directly observed? In this paper, we introduce a cognitive model of epistemic language interpretation, grounded in Bayesian inferences about other agents' goals, beliefs, and intentions: a language-augmented Bayesian theory-of-mind (LaBToM). By translating natural language into an epistemic "language-of-thought" with grammar-constrained LLM decoding, then evaluating these translations against the inferences produced by inverting a generative model of rational action and perception, LaBToM captures graded plausibility judgments of epistemic claims. We validate our model in an experiment where participants watch an agent navigate a maze to find keys hidden in boxes needed to reach their goal, then rate sentences about the agent's beliefs. In contrast with multimodal LLMs (GPT-4o, Gemini Pro) and ablated models, our model correlates highly with human judgments for a wide range of expressions, including modal language, uncertainty expressions, knowledge claims, likelihood comparisons, and attributions of false belief.
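The evaluation step described above can be illustrated with a toy sketch. It is not the authors' implementation: the `Hypothesis` structure, the formula constructors (`believes`, `more_likely`, `negate`), the 0.75 belief threshold, and the three-hypothesis posterior are all assumptions made for illustration. The idea shown is only that, once a sentence has been translated into a formula, its plausibility can be scored as the posterior mass of hypothesized mental states in which the formula holds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Hypothesis:
    """One weighted guess about the agent's mind (hypothetical structure)."""
    weight: float               # observer's posterior weight for this hypothesis
    credence: Dict[str, float]  # agent's own P(key is in each box)


# A formula maps a hypothesized mental state to True or False.
Formula = Callable[[Hypothesis], bool]


def believes(box: str, threshold: float = 0.75) -> Formula:
    """'The agent believes the key is in `box`' (threshold is an assumption)."""
    return lambda h: h.credence.get(box, 0.0) >= threshold


def more_likely(box_a: str, box_b: str) -> Formula:
    """'The agent thinks `box_a` is more likely than `box_b` to hold the key.'"""
    return lambda h: h.credence.get(box_a, 0.0) > h.credence.get(box_b, 0.0)


def negate(formula: Formula) -> Formula:
    """Logical negation of a formula."""
    return lambda h: not formula(h)


def statement_probability(formula: Formula, posterior: List[Hypothesis]) -> float:
    """Normalized posterior mass of the hypotheses in which `formula` holds."""
    total = sum(h.weight for h in posterior)
    if total <= 0:
        raise ValueError("posterior must have positive total weight")
    return sum(h.weight for h in posterior if formula(h)) / total


# Made-up posterior over the agent's beliefs, standing in for BToM output.
posterior = [
    Hypothesis(0.5, {"box1": 0.9, "box2": 0.1}),
    Hypothesis(0.3, {"box1": 0.5, "box2": 0.5}),
    Hypothesis(0.2, {"box1": 0.1, "box2": 0.9}),
]

if __name__ == "__main__":
    print(statement_probability(believes("box1"), posterior))
    print(statement_probability(more_likely("box1", "box2"), posterior))
    print(statement_probability(negate(believes("box1")), posterior))
```

In this sketch the graded score comes entirely from the observer's uncertainty over the agent's mental state, so a statement can be rated as partly plausible even though each individual hypothesis makes it strictly true or false.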
