Beyond Words: Evaluating and Bridging Epistemic Divergence in User-Agent Interaction via Theory of Mind
Minyuan Ruan, Ziyue Wang, Kaiming Liu, Yunghwei Lai, Peng Li, Yang Liu
TL;DR
This work reframes interaction with LLMs as a Theory of Mind (ToM) problem, formalizing a mechanism to detect and resolve epistemic divergence between a user’s subjective belief $b$ and the true environment state $s^*$. It introduces SynchToM, a four-domain benchmark with a two-stage data pipeline that generates belief–profile–state scenarios and 10-turn interaction trajectories, enabling evaluation of ToM utility in practical tasks. Across 11 models, results show ToM performance is domain-dependent and that ground-truth ToM factors significantly boost task success, while misalignment causes resolution failures; ToM reasoning is shown to be transferable and trainable via trajectory data. The authors demonstrate functional enhancements through trajectory-based reinforcement learning with ToM tokens and a training-free multi-agent collaboration setup, underscoring the practical impact of ToM for more robust, user-aligned AI agents in real-world settings.
Abstract
Large Language Models (LLMs) have developed rapidly and are widely applied to both general-purpose and professional tasks to assist human users. However, they still struggle to comprehend and respond to users' true needs when intentions and instructions are imprecisely conveyed, leading to a divergence between subjective user beliefs and true environment states. Resolving this epistemic divergence requires Theory of Mind (ToM), yet existing ToM evaluations for LLMs primarily focus on isolated belief inference, overlooking its functional utility in real-world interaction. To this end, we formalize ToM for LLMs as a mechanism for epistemic divergence detection and resolution, and propose a benchmark, SynchToM, to assess how models reconcile user beliefs and profiles in practice. Results across 11 leading models reveal a significant limitation in identifying the underlying cognitive gaps that impede task success. To bridge this gap, we further curate a trajectory-based ToM dataset linking belief tracking with task-related state inference. The model trained on this data via reinforcement learning shows consistent improvement in reasoning about user mental states, leading to enhanced downstream performance. Our work highlights the practical value of ToM as an essential interaction-level mechanism rather than as a standalone reasoning skill.
