Let's Put Ourselves in Sally's Shoes: Shoes-of-Others Prefilling Improves Theory of Mind in Large Language Models
Kazutoshi Shinoda, Nobukatsu Hojo, Kyosuke Nishida, Yoshihiro Yamazaki, Keita Suzuki, Hiroaki Sugiyama, Kuniko Saito
TL;DR
The paper introduces Shoes-of-Others (SoO) prefilling, an inference-time method that prefixes LLM outputs with a perspective-taking prompt to improve Theory of Mind without fine-tuning. SoO prefilling demonstrates consistent gains across five mental-state categories on two ToM benchmarks (ToMATO and ToMBench), outperforming CoT-based and prompting baselines, and remains effective across multiple model families. Analyses show improvements arise from increased faithfulness of the model's intermediate thoughts rather than mere longer reasoning, and the benefits persist without relying on extended compute. The work connects explicit perspective-taking to enhanced ToM performance and discusses implications for ASD-related research, while acknowledging limitations in scope, potential biases, and ethical considerations.
Abstract
Recent studies have shown that Theory of Mind (ToM) in large language models (LLMs) has not reached human-level performance yet. Since fine-tuning LLMs on ToM datasets often degrades their generalization, several inference-time methods have been proposed to enhance ToM in LLMs. However, existing inference-time methods for ToM are specialized for inferring beliefs from contexts involving changes in the world state. In this study, we present a new inference-time method for ToM, Shoes-of-Others (SoO) prefilling, which makes fewer assumptions about contexts and is applicable to broader scenarios. SoO prefilling simply specifies the beginning of LLM outputs with ``Let's put ourselves in A's shoes.'', where A denotes the target character's name. We evaluate SoO prefilling on two benchmarks that assess ToM in conversational and narrative contexts without changes in the world state and find that it consistently improves ToM across five categories of mental states. Our analysis suggests that SoO prefilling elicits faithful thoughts, thereby improving the ToM performance.
