Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf
Xuanfa Jin, Ziyan Wang, Yali Du, Meng Fang, Haifeng Zhang, Jun Wang
TL;DR
This work addresses the challenge of strategic discussion in multi-agent games under uncertainty by formulating ONUW as a Multi-Phase Extensive-Form Bayesian Game and proving the existence of Perfect Bayesian Equilibria (PBEs) both with and without Day-phase discussion. It then introduces an RL-instructed, LLM-based agent framework in which a discussion policy, trained with offline RL (Conservative Q-Learning), selects from a discrete set of discussion tactics to shape players' beliefs and public speech, with the aim of better approximating the PBEs of ONUW. Empirical results in three- and five-player ONUW show that the learned discussion policy brings agents closer to equilibrium play and improves their performance with both GPT-4 and Gemini backends, and that the RL-trained policy outperforms direct LLM prompting. The findings highlight the importance of controllable discussion strategies in complex communication games and suggest a scalable path toward robust, belief-grounded LLM agents in uncertain, strategic environments.
Abstract
Communication is a fundamental aspect of human society, facilitating the exchange of information and beliefs among people. Despite advances in large language models (LLMs), recent LLM-based agents often neglect control over discussion tactics, which are essential in communication scenarios and games. As a variant of the famous communication game Werewolf, One Night Ultimate Werewolf (ONUW) requires players to develop strategic discussion policies, because potential role changes increase the uncertainty and complexity of the game. In this work, we first show the existence of Perfect Bayesian Equilibria (PBEs) in two scenarios of the ONUW game: one with discussion and one without. The results show that discussion greatly changes players' utilities by affecting their beliefs, which underscores the significance of discussion tactics. Based on the insights from these analyses, we propose an RL-instructed language agent framework, in which a discussion policy trained by reinforcement learning (RL) determines the appropriate discussion tactic to adopt. Our experimental results on several ONUW game settings demonstrate the effectiveness and generalizability of the proposed framework. Project page: https://one-night-ultimate-werewolf.github.io
