Large Language Models Persuade Without Planning Theory of Mind

Jared Moore, Rasmus Overmark, Ned Cooper, Beba Cibralic, Nick Haber, Cameron R. Jones

TL;DR

This study investigates Planning Theory of Mind (PToM) in adult humans and Large Language Models (LLMs) by extending a first-person interactive persuasion task (MindGames) across three experiments. Experiment 1 shows humans outperforming an LLM (o3) in a tightly controlled, PToM-heavy Hidden condition, while LLMs excel when mental states are Revealed. Experiments 2 and 3 broaden external validity by involving human targets and real target preferences, revealing that LLMs can persuade humans effectively even without explicit ToM, often outperforming humans in Revealed settings and when targets’ values can be shifted. The results argue that strong persuasion does not require human-like PToM, highlight distinct ToM styles (causal vs associative), and emphasize careful interpretation of ToM capabilities in LLMs with real-world implications for influence and policy. Overall, the work introduces a scalable, interactive framework for dissociating ToM from persuasion and invites deeper inquiry into the cognitive strategies underlying both humans and LLMs.

Abstract

A growing body of work attempts to evaluate the theory of mind (ToM) abilities of humans and large language models (LLMs) using static, non-interactive question-and-answer benchmarks. However, theoretical work in the field suggests that first-personal interaction is a crucial part of ToM and that such predictive, spectatorial tasks may fail to evaluate it. We address this gap with a novel ToM task that requires an agent to persuade a target to choose one of three policy proposals by strategically revealing information. Success depends on a persuader's sensitivity to a given target's knowledge states (what the target knows about the policies) and motivational states (how much the target values different outcomes). We varied whether these states were Revealed to persuaders or Hidden, in which case persuaders had to inquire about or infer them. In Experiment 1, participants persuaded a bot programmed to make only rational inferences. LLMs excelled in the Revealed condition but performed below chance in the Hidden condition, suggesting difficulty with the multi-step planning required to elicit and use mental state information. Humans performed moderately well in both conditions, indicating an ability to engage such planning. In Experiment 2, where a human target role-played the bot, and in Experiment 3, where we measured whether human targets' real beliefs changed, LLMs outperformed human persuaders across all conditions. These results suggest that effective persuasion can occur without explicit ToM reasoning (e.g., through rhetorical strategies) and that LLMs excel at this form of persuasion. Overall, our results caution against attributing human-like ToM to LLMs while highlighting LLMs' potential to influence people's beliefs and behavior.
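The task mechanics described above (a target who picks among three proposals based on what it knows and what it values, and a persuader who chooses which facts to reveal) can be illustrated with a toy model. The sketch below is not the authors' implementation: the attribute names, effect sizes, value weights, and the helper names `utility` and `rational_choice` are all hypothetical, chosen only to show why selective disclosure can succeed where full disclosure fails.

```python
# Toy model of a rational target (all numbers and names are illustrative
# assumptions, not taken from the paper).

# EFFECTS[proposal][attribute]: +1 increases the attribute, -1 decreases it.
EFFECTS = {
    "A": {"safety": 1, "cost": -1, "speed": -1},
    "B": {"safety": 0, "cost": 1, "speed": 1},
    "C": {"safety": -1, "cost": 0, "speed": 1},
}

# Motivational state: how much the target values each attribute.
VALUES = {"safety": 3, "cost": 1, "speed": 2}


def utility(proposal, values, known):
    """Utility of a proposal, counting only the facts the target knows."""
    return sum(
        values[attr] * effect
        for attr, effect in EFFECTS[proposal].items()
        if (proposal, attr) in known
    )


def rational_choice(values, known):
    """Pick the proposal with the highest utility under current knowledge."""
    return max(sorted(EFFECTS), key=lambda p: utility(p, values, known))


# Knowledge state: the target initially knows a single fact.
initial = {("B", "cost")}
# The persuader (who wants A) reveals only the fact that favors A.
selective = initial | {("A", "safety")}
# Revealing every fact instead exposes A's downsides and B's upsides.
everything = {(p, a) for p in EFFECTS for a in EFFECTS[p]}

print(rational_choice(VALUES, initial))     # B
print(rational_choice(VALUES, selective))   # A
print(rational_choice(VALUES, everything))  # B
```

In the Revealed condition a persuader would be handed `VALUES` and `initial` directly; in the Hidden condition it would first have to ask about or infer them before it could work out which single fact to disclose.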

Paper Structure

This paper contains 52 sections, 14 equations, 17 figures, 2 tables.

Figures (17)

  • Figure 1: The view a persuader has when interacting with a target. In the Revealed condition (shown), the persuader has access to the target's mental states in the "What the other player knows" section; in the Hidden condition, the persuader does not see this section. Blue messages on the right are the persuader's. Black messages on the left are the target's. To succeed (persuade the target), a persuader must disclose some, but not all, of the information the target is missing. In the "real persuasion" condition (E3), persuaders were not told which attributes they liked (just "You want proposal A", B, or C), and in the "What the other player knows" section they saw the target's answers to a few related survey questions. A demo of Experiment 1 is available at https://mindgames.psych-experiments.com. This chat depicts the beginning of a successful persuasive conversation between two humans where the target was given a value function to maximize (role-play persuasion; E2). Full dialogue in Appendix Fig. \ref{fig:example-e2-human}.
  • Figure 2: Example Conversations from each of the three experiments (columns) where the persuader was the LLM o3 (top row) or a human (bottom row).
  • Figure 3: In the Hidden condition of Experiment 1, which tightly measures "planning with ToM", humans outperform o3, while o3 outperforms humans in the easier Revealed condition. "Rational target success" is how often, on average, the rational bot chose the persuader's preferred proposal at the end of the game. Each bar (condition) has about 200 games for n=124 human participants. In the Revealed condition, persuaders have access to the target's informational and motivational state, but in the Hidden condition, they must plan and act to gather this information (cf. Fig. \ref{fig:policy-game}). The scatter plots show individual-level performance (averaged from up to five games per human participant). Error bars show bootstrapped 95% confidence intervals. The grey, dashed line at $0.1$ shows the random baseline.
  • Figure 4: In both the Hidden and Revealed conditions in Experiment 2, o3 outperforms human persuaders on the ability to persuade actual human participants. "Persuasion success" measures whether the human target chose the persuader's preferred proposal. (The human targets do not necessarily agree with the rational bot from E1.) Each bar (condition) represents 48 to 64 games (human persuader: 48 Hidden, 64 Revealed; o3: 59 Hidden, 60 Revealed) for n=152 human participants. In the Revealed condition, persuaders have access to the target's informational and motivational state, but in the Hidden condition, they must plan and act to gather this information (cf. Fig. \ref{fig:policy-game}). The scatter plots show individual-level performance (averaged from up to five games per human participant). Error bars show bootstrapped 95% confidence intervals. The grey, dashed line at $0.1$ shows the random baseline.
  • Figure 5: In Experiment 2, human targets outperform the rational bot, achieving the same utility 81% of the time and beating it about 19% of the time. Shown are the proportions of games in which the human target agreed with the rational bot ($R=T$); did not agree with the rational bot but made a choice with as good an outcome given the prescribed value function and available information ($U(T) = U(R)$); made a worse choice than the rational bot ($U(R) > U(T)$); or made a better choice than the rational bot ($U(T) > U(R)$). We combine the games where o3 and humans were persuaders. Error bars are 95% bootstrapped confidence intervals, computed by averaging the proportion of success across participants.
  • ...and 12 more figures
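
The captions above report bootstrapped 95% confidence intervals over participant-level success rates. A minimal sketch of one common way to compute such an interval, the percentile bootstrap, is shown below. The paper's exact resampling procedure is not described in this section, so the function name `bootstrap_ci`, the number of resamples, and the example data are assumptions made for illustration.

```python
import random
from statistics import mean


def bootstrap_ci(per_participant, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean.

    per_participant: one success proportion per participant
    (for example, averaged over up to five games each).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(per_participant)
    # Resample participants with replacement and record each resample's mean.
    resampled_means = sorted(
        mean(rng.choices(per_participant, k=n)) for _ in range(n_resamples)
    )
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return resampled_means[lower_idx], resampled_means[upper_idx]


# Hypothetical per-participant success proportions (not data from the paper).
scores = [0.0, 0.2, 0.2, 0.4, 0.4, 0.4, 0.6, 0.6, 0.8, 1.0]
low, high = bootstrap_ci(scores)
print(f"mean={mean(scores):.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

Resampling at the participant level, rather than the game level, respects the fact that games from the same participant are not independent.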