Leading the Follower: Learning Persuasive Agents in Social Deduction Games
Zhang Zheng, Deheng Ye, Peilin Zhao, Hao Wang
TL;DR
The paper proposes a Stackelberg-based framework for turn-based dialogue in social deduction games, framing each speaking turn as a leader-follower interaction in which the leader optimizes its utterance to steer the follower's response. It develops an RL training pipeline in which an API-based backend generates base utterances and an open-source Refiner rewrites them to maximize persuasive impact, guided by a Measurer that estimates follower response probabilities. Group Relative Policy Optimization (GRPO) drives the refinement without requiring explicit human preference data. Across Werewolf, Avalon, and ONUW, the approach yields consistent gains over strong baselines and generalizes across different backend LLMs, indicating that the persuasive capability does not depend on a particular backend model. The work demonstrates a principled method for equipping AI agents with strategic social influence, with potential applications in other domains that require persuasive, multi-turn communication under uncertainty.
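To make the Measurer-plus-GRPO idea concrete, here is a minimal sketch of the group-relative advantage step that GRPO is built around: several candidate refinements of one base utterance are scored, and each score is normalized against the group's mean and standard deviation. The function `toy_measurer`, the keyword-based scoring rule, and the example utterances are all hypothetical stand-ins invented for illustration; the paper's Measurer estimates follower response probabilities with a model, and its exact reward is not given in this summary.

```python
from statistics import mean, pstdev


def toy_measurer(utterance: str, desired_keywords: list[str]) -> float:
    """Hypothetical stand-in for the Measurer.

    Returns the fraction of desired keywords that appear in the utterance.
    The real Measurer would estimate how likely the follower is to give
    the response the leader wants; this rule is only an assumption.
    """
    words = utterance.lower().split()
    hits = sum(1 for keyword in desired_keywords if keyword in words)
    return hits / len(desired_keywords)


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its own sampling group.

    This is the group-relative baseline used by GRPO: no learned value
    function, only the group's mean and (population) standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical candidate refinements of a single base utterance.
candidates = [
    "I think player 3 is the werewolf",
    "Player 3 contradicted himself so vote player 3",
    "I have no idea",
]
rewards = [toy_measurer(c, ["vote", "3"]) for c in candidates]
advantages = group_relative_advantages(rewards)

for text, adv in zip(candidates, advantages):
    print(f"{adv:+.3f}  {text}")
```

In a full training loop these advantages would weight the policy-gradient update of the Refiner, so that refinements scoring above the group average become more likely. Because the baseline comes from the group itself, no separate human preference labels are needed, which matches the claim in the TL;DR.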
Abstract
Large language model (LLM) agents have shown remarkable progress in social deduction games (SDGs). However, existing approaches primarily focus on information processing and strategy selection, overlooking the significance of persuasive communication in influencing other players' beliefs and responses. In SDGs, success depends not only on making correct deductions but also on convincing others to respond in alignment with one's intent. To address this limitation, we formalize turn-based dialogue in SDGs as a Stackelberg competition, where the current player acts as the leader who strategically influences the follower's response. Building on this theoretical foundation, we propose a reinforcement learning framework that trains agents to optimize utterances for persuasive impact. Through comprehensive experiments across three diverse SDGs, we demonstrate that our agents significantly outperform baselines. This work represents a significant step toward developing AI agents capable of strategic social influence, with implications extending to scenarios requiring persuasive communication.
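For readers unfamiliar with the leader-follower framing, a generic textbook Stackelberg formulation applied to a single speaking turn can be written as below. The symbols are illustrative assumptions and not the paper's notation: $u$ is the leader's utterance, $r$ is the follower's response, and $U_L$, $U_F$ are the leader's and follower's utilities.

```latex
% Generic Stackelberg (bilevel) form; notation assumed for illustration.
\begin{aligned}
  u^{*} &\in \arg\max_{u \in \mathcal{U}} \; U_L\bigl(u,\, r^{*}(u)\bigr), \\
  \text{where}\quad r^{*}(u) &\in \arg\max_{r \in \mathcal{R}} \; U_F(u,\, r).
\end{aligned}
```

The leader commits to an utterance first while anticipating the follower's best response to it, which is why optimizing the utterance amounts to optimizing the response it induces.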
