Large Language Models Persuade Without Planning Theory of Mind
Jared Moore, Rasmus Overmark, Ned Cooper, Beba Cibralic, Nick Haber, Cameron R. Jones
TL;DR
This study investigates Planning Theory of Mind (PToM) in adult humans and Large Language Models (LLMs) by extending a first-person interactive persuasion task (MindGames) across three experiments. Experiment 1 shows humans outperforming an LLM (o3) in a tightly controlled, PToM-heavy Hidden condition, while LLMs excel when mental states are Revealed. Experiments 2 and 3 broaden external validity by involving human targets and real target preferences, revealing that LLMs can persuade humans effectively even without explicit ToM, often outperforming humans in Revealed settings and when targets’ values can be shifted. The results suggest that strong persuasion does not require human-like PToM, highlight distinct ToM styles (causal vs. associative), and call for careful interpretation of ToM capabilities in LLMs, with real-world implications for influence and policy. Overall, the work introduces a scalable, interactive framework for dissociating ToM from persuasion and invites deeper inquiry into the cognitive strategies underlying both humans and LLMs.
Abstract
A growing body of work attempts to evaluate the theory of mind (ToM) abilities of humans and large language models (LLMs) using static, non-interactive question-and-answer benchmarks. However, theoretical work in the field suggests that first-personal interaction is a crucial part of ToM and that such predictive, spectatorial tasks may fail to evaluate it. We address this gap with a novel ToM task that requires an agent to persuade a target to choose one of three policy proposals by strategically revealing information. Success depends on a persuader's sensitivity to a given target's knowledge states (what the target knows about the policies) and motivational states (how much the target values different outcomes). We varied whether these states were Revealed to persuaders or Hidden, in which case persuaders had to inquire about or infer them. In Experiment 1, participants persuaded a bot programmed to make only rational inferences. LLMs excelled in the Revealed condition but performed below chance in the Hidden condition, suggesting difficulty with the multi-step planning required to elicit and use mental state information. Humans performed moderately well in both conditions, indicating an ability to engage in such planning. In Experiment 2, where a human target role-played the bot, and in Experiment 3, where we measured whether human targets' real beliefs changed, LLMs outperformed human persuaders across all conditions. These results suggest that effective persuasion can occur without explicit ToM reasoning (e.g., through rhetorical strategies) and that LLMs excel at this form of persuasion. Overall, our results caution against attributing human-like ToM to LLMs while highlighting LLMs' potential to influence people's beliefs and behavior.
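To make the task structure concrete, the following is a minimal sketch of a rational target of the kind the abstract describes, not the implementation used in the study. It assumes the target scores each proposal as the sum of value-weighted effects it currently knows about and picks the highest-scoring one. The proposal names, attributes, numbers, and the `choose` function are all hypothetical illustrations.

```python
from typing import Dict, Set, Tuple

# Hypothetical ground truth: the effect of each proposal on each attribute.
# These names and numbers are illustrative assumptions, not study materials.
EFFECTS: Dict[str, Dict[str, int]] = {
    "A": {"safety": 1, "cost": -1, "growth": 0},
    "B": {"safety": -1, "cost": 1, "growth": 1},
    "C": {"safety": 0, "cost": 0, "growth": -1},
}

# Motivational state: how much the target values each attribute.
VALUES: Dict[str, int] = {"safety": 2, "cost": 1, "growth": 0}

# Knowledge state: the (proposal, attribute) facts the target already knows.
KNOWN: Set[Tuple[str, str]] = {("B", "cost")}


def choose(values: Dict[str, int], known: Set[Tuple[str, str]]) -> str:
    """Return the proposal with the highest utility over known facts only."""

    def utility(proposal: str) -> int:
        # Facts the target has not been told contribute nothing.
        return sum(
            values[attr] * EFFECTS[prop][attr]
            for (prop, attr) in known
            if prop == proposal
        )

    # Iterating in sorted order makes ties resolve alphabetically.
    return max(sorted(EFFECTS), key=utility)


# Before any message, the target prefers B (utility 1 versus 0 and 0).
before = choose(VALUES, KNOWN)

# Revealing a fact the target cares about shifts the choice to A.
after_useful = choose(VALUES, KNOWN | {("A", "safety")})

# Revealing a fact the target does not value leaves the choice at B.
after_useless = choose(VALUES, KNOWN | {("A", "growth")})
```

Under these assumptions, a persuader who wants the target to pick A succeeds only by revealing information matched to what the target values and does not yet know. In the Hidden condition the persuader would first have to ask about or infer `VALUES` and `KNOWN` before selecting what to reveal.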
