AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling

Zhining Zhang, Chuanyang Jin, Mung Yao Jia, Shunchi Zhang, Tianmin Shu

TL;DR

AutoToM presents a fully automated, model-based Theory of Mind framework that uses Bayesian inverse planning with an LLM backend and automated agent model discovery to infer any mental state across diverse domains. By jointly learning suitable agent models, timesteps, and hypotheses, the approach achieves strong performance across five ToM benchmarks and cognitive studies, outperforming prompting-based LLMs and prior model-based methods. The framework also yields human-like confidence estimates and supports online, embodied decision-making, demonstrating practical applicability for interactive AI systems. Overall, AutoToM offers a scalable, robust, and interpretable pathway toward cognitively grounded machine Theory of Mind.

Abstract

Theory of Mind (ToM), the ability to understand people's minds based on their behavior, is key to developing socially intelligent agents. Current approaches to ToM reasoning either rely on prompting Large Language Models (LLMs), which are prone to systematic errors, or use handcrafted, rigid agent models for model-based inference, which are more robust but fail to generalize across domains. In this work, we introduce AutoToM, an automated agent modeling method for scalable, robust, and interpretable mental inference. Given a ToM problem, AutoToM first proposes an initial agent model and then performs automated Bayesian inverse planning based on this model, leveraging an LLM backend. Guided by inference uncertainty, it iteratively refines the model by introducing additional mental variables and/or incorporating more timesteps in the context. Across five diverse benchmarks, AutoToM outperforms existing ToM methods and even large reasoning models. Additionally, we show that AutoToM can produce human-like confidence estimates and enable online mental inference for embodied decision-making.
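
As a rough illustration of the inference the abstract describes, the quantity of interest can be written as a marginal over the latent mental variables. This is a generic Bayesian-network form rather than the paper's own equation. It borrows the notation of the Figure 1 caption ($X^{t_s:t}$ for observable variables, $V^{t_s:t}$ for latent mental variables, $q$ for the queried variable); the symbol $\mathrm{pa}(u)$, denoting the parents of a variable $u$ in the proposed agent model, is an assumption introduced here.

$$
P\left(q \mid X^{t_s:t}\right) \;\propto\; \sum_{V^{t_s:t} \setminus \{q\}} P\left(V^{t_s:t}, X^{t_s:t}\right),
\qquad
P\left(V^{t_s:t}, X^{t_s:t}\right) \;=\; \prod_{u \,\in\, V^{t_s:t} \cup X^{t_s:t}} P\left(u \mid \mathrm{pa}(u)\right)
$$

Under this reading, adding a mental variable changes which factors appear in the product, and adding timesteps moves $t_s$ earlier so that more of the context enters the product.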

Paper Structure

This paper contains 47 sections, 7 equations, 11 figures, 15 tables, 2 algorithms.

Figures (11)

  • Figure 1: An overview of AutoToM. $X^{t_s:t}$ are observable variables, $V^{t_s:t}$ are latent mental variables, and $q$ is the query (in this case, a mental variable $v_i^t \in V^{t}$). $t_s:t$ denotes timesteps from $t_s$ to $t$ in the context that are considered for inference. Variables $s^t, o^t, b^t, a^t, g^t$ represent state, observation, belief, action, and goal, respectively, with solid arrows indicating dependencies defined in the models. Given a question, we extract the observable variables (information extraction) and propose an initial agent model. This is followed by automated Bayesian inverse planning and iterative model adjustment. When the model utility is high enough, we will produce the final answer based on the inference result.
  • Figure 2: Overview of AutoToM's capacities and applications evaluated in this work. (a) Example questions (top panels) and the necessary agent model for model-based inference (bottom panels) in diverse Theory of Mind benchmarks. Questions in these benchmarks encompass different mental variables, contexts, numbers of agents, the presence or absence of utterances, wording styles, and modalities. (b) AutoToM can produce human-like confidence estimation in classic cognitive studies. (c) AutoToM can also be used for online goal inference to enhance embodied assistance, where it sequentially updates the inference of a main agent's goal to inform a helper agent's assistance.
  • Figure 3: (a) Given an agent model, AutoToM samples hypotheses for each latent variable ($o^t$ and $b^t$ in this example), removes spurious hypotheses, and conducts Bayesian inference based on estimated local conditionals. (b) Given any ToM inference problem, AutoToM refines the agent model by alternating between variable adjustment (introducing belief in this example) and timestep adjustment.
  • Figure 4: Comparison of AutoToM and large reasoning models across various conditions (summarized among all benchmarks): (a) question types, (b) context length, (c) the number of agents, and (d) the level of recursion. Note that "Level 1 Action" refers to Forward Action inference in BigToM, and "Level 2 Goal" refers to the Belief of Goal inference in MuMA-ToM.
  • Figure 5: A qualitative example of AutoToM's model adjustment and inference process in a false-belief scenario from BigToM (Gandhi et al., 2024). We show the result at each key step, illustrating how AutoToM adjusts the agent model to increase inference confidence.
  • ...and 6 more figures
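
The inference step described in the Figure 3 caption (score hypotheses with local conditionals, then combine them by Bayes' rule) can be sketched in a few lines. This is a minimal toy under stated assumptions, not the authors' implementation: the function names, the hand-written probability tables standing in for LLM-estimated conditionals, the scenario, and the 0.75 confidence threshold are all hypothetical.

```python
from math import log


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("all hypotheses have zero weight")
    return {h: w / total for h, w in weights.items()}


def goal_posterior(goal_prior, belief_probs, action_likelihood, action):
    """P(g | a) is proportional to P(g) * sum_b P(b | o) * P(a | b, g)."""
    unnormalized = {
        goal: p_goal * sum(
            p_belief * action_likelihood[(action, belief, goal)]
            for belief, p_belief in belief_probs.items()
        )
        for goal, p_goal in goal_prior.items()
    }
    return normalize(unnormalized)


def entropy(dist):
    """Shannon entropy in nats; lower means a more certain inference."""
    return -sum(p * log(p) for p in dist.values() if p > 0)


def confident_enough(dist, threshold=0.75):
    # Hypothetical stopping rule: the paper's actual model utility
    # is not reproduced here.
    return max(dist.values()) >= threshold


# Hypothetical numbers; in AutoToM these conditionals would be estimated
# by the LLM backend rather than written by hand.
goal_prior = {"get_apple": 0.5, "get_book": 0.5}
belief_probs = {"apple_in_fridge": 0.8, "apple_on_table": 0.2}  # P(b | o)
action_likelihood = {  # P(a | b, g), keyed by (action, belief, goal)
    ("walk_to_fridge", "apple_in_fridge", "get_apple"): 0.9,
    ("walk_to_fridge", "apple_on_table", "get_apple"): 0.1,
    ("walk_to_fridge", "apple_in_fridge", "get_book"): 0.2,
    ("walk_to_fridge", "apple_on_table", "get_book"): 0.2,
}

posterior = goal_posterior(
    goal_prior, belief_probs, action_likelihood, "walk_to_fridge"
)
print(posterior, confident_enough(posterior))
```

With these numbers the posterior on `get_apple` is about 0.79, so the toy stopping rule is satisfied. Had it not been, the caption's refinement loop would correspond to adding a latent variable or an earlier timestep and recomputing the posterior.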