Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities
Alex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency
TL;DR
This work tackles the challenge of enabling theory-of-mind (ToM) reasoning in large language models (LLMs). It introduces SimToM, a two-stage prompting framework inspired by Simulation Theory: the model first performs perspective-taking, filtering the context down to what a given character knows, and then answers the question from that character's perspective. The approach requires no model fine-tuning. Across the ToMI and BigTOM benchmarks, SimToM yields substantial improvements over zero-shot and single-pass prompting, with domain-specific prompts and oracle perspectives providing further gains. The findings highlight perspective-taking as a core lever for enhancing LLM ToM capabilities and point to promising directions for future research and application in socially aware AI systems.
Abstract
Human interactions are deeply rooted in the interplay of thoughts, beliefs, and desires made possible by Theory of Mind (ToM): our cognitive ability to understand the mental states of ourselves and others. Although ToM may come naturally to us, emulating it presents a challenge to even the most advanced Large Language Models (LLMs). Recent improvements to LLMs' reasoning capabilities from simple yet effective prompting techniques such as Chain-of-Thought have seen limited applicability to ToM. In this paper, we turn to the prominent cognitive science theory "Simulation Theory" to bridge this gap. We introduce SimToM, a novel two-stage prompting framework inspired by Simulation Theory's notion of perspective-taking. To implement this idea on current ToM benchmarks, SimToM first filters context based on what the character in question knows before answering a question about their mental state. Our approach, which requires no additional training and minimal prompt-tuning, shows substantial improvement over existing methods, and our analysis reveals the importance of perspective-taking to Theory-of-Mind capabilities. Our findings suggest perspective-taking as a promising direction for future research into improving LLMs' ToM capabilities.
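To make the two-stage idea concrete, the following is a minimal sketch of how perspective-taking followed by question answering might be wired together. It is not the authors' implementation: the prompt wording, the function name `two_stage_answer`, and the `llm` callable (any function that maps a prompt string to a completion string) are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical prompt templates; the paper's actual prompts may differ.
PERSPECTIVE_PROMPT = (
    "The following is a sequence of events:\n{story}\n\n"
    "Which of these events does {character} know about? "
    "List only those events, one per line."
)
QUESTION_PROMPT = (
    "{perspective}\n\n"
    "You are {character}. Based only on the events above, "
    "answer the following question.\n{question}"
)


def two_stage_answer(
    llm: Callable[[str], str],
    story: str,
    character: str,
    question: str,
) -> str:
    """Answer a question about a character's mental state in two LLM calls.

    `llm` is an assumed interface: any callable that takes a prompt string
    and returns the model's text completion.
    """
    # Stage 1 (perspective-taking): ask the model to filter the story down
    # to the events the character is aware of.
    perspective = llm(
        PERSPECTIVE_PROMPT.format(story=story, character=character)
    )

    # Stage 2 (question answering): pose the question using only the
    # filtered context, so events the character did not witness are
    # not visible to the model.
    return llm(
        QUESTION_PROMPT.format(
            perspective=perspective,
            character=character,
            question=question,
        )
    )
```

In this sketch the second call never sees the full story, only the first call's output, which is the property the abstract attributes to filtering context before answering. Any real use would substitute an actual model client for `llm` and would likely need prompt wording tuned to the benchmark.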
