DVM: Towards Controllable LLM Agents in Social Deduction Games

Zheng Zhang; Yihuai Lan; Yangsen Chen; Lei Wang; Xiang Wang; Hao Wang

TL;DR

The paper addresses the need to control the proficiency of LLM agents in social deduction games (SDGs). It introduces DVM, a framework with three components (Predictor, Decider, and Discussor) trained with supervised learning and reinforcement learning. Performance is modulated by a win-rate-constrained reward that adds a decision-chain reward $cr$ to each step reward $sr_t$: $r_t = sr_t + cr$, with $cr(DC) = \alpha (WR - 0.5)$. The Predictor informs the Decider about player relations via $P_t = \text{Predictor}(D_t, V_t)$. The Decider outputs actions through $\text{Logits}(a) = \text{Decider}(G_t, P_t, WR_{cons.})$ and $\text{Prob}(a) = \text{Softmax}(\text{Logits}(a) - a_{mask} \times 10^9)$. The Discussor produces contextually relevant dialogue. Training proceeds in two steps: FanLang-9 supervised fine-tuning, followed by RL with PPO for the Decider and DPO for the Predictor. A combined reward framework and a tunable control mechanism keep actual win rates near the targeted levels. Experiments in Werewolf show that DVM outperforms existing methods, achieves predefined win-rate targets, and benefits from the proposed decision-chain reward, indicating that adaptive, fair, and balanced SDG agents are viable for practical applications.
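The two formulas for action probabilities and the combined reward can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that $a_{mask}$ is 1 for actions that are unavailable and 0 otherwise, that $WR$ is a win rate in $[0, 1]$, and that $\alpha$ is a scalar weight; the function names and the default `alpha=1.0` are hypothetical.

```python
import math


def masked_softmax(logits, mask, penalty=1e9):
    """Prob(a) = Softmax(Logits(a) - a_mask * 1e9).

    Assumption: mask[i] == 1 marks an action that must not be chosen.
    """
    shifted = [logit - m * penalty for logit, m in zip(logits, mask)]
    peak = max(shifted)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


def chain_reward(win_rate, alpha=1.0):
    """cr(DC) = alpha * (WR - 0.5); alpha=1.0 is an illustrative default."""
    return alpha * (win_rate - 0.5)


def total_reward(step_reward, win_rate, alpha=1.0):
    """r_t = sr_t + cr: the step reward plus the decision-chain reward."""
    return step_reward + chain_reward(win_rate, alpha)


if __name__ == "__main__":
    probs = masked_softmax([1.0, 2.0, 3.0], [0, 1, 0])
    print(probs)                       # the masked action gets probability 0.0
    print(total_reward(0.2, 0.7, alpha=2.0))
```

Under these assumptions, the large penalty pushes a masked logit so far down that its exponential underflows to zero, so the remaining probability mass is shared only among the unmasked actions. The chain reward is zero at a win rate of 0.5 and changes sign on either side of it.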

Abstract

Large Language Models (LLMs) have advanced the capability of game agents in social deduction games (SDGs). These games rely heavily on conversation-driven interactions and require agents to infer, make decisions, and express themselves based on that information. While this progress leads to more sophisticated and strategic non-player characters (NPCs) in SDGs, there is a need to control the proficiency of these agents. This control not only ensures that NPCs can adapt to varying difficulty levels during gameplay, but also provides insights into the safety and fairness of LLM agents. In this paper, we present DVM, a novel framework for developing controllable LLM agents for SDGs, and demonstrate its implementation on one of the most popular SDGs, Werewolf. DVM comprises three main components: Predictor, Decider, and Discussor. By integrating reinforcement learning with a win rate-constrained decision chain reward mechanism, we enable agents to dynamically adjust their gameplay proficiency to achieve specified win rates. Experiments show that DVM not only outperforms existing methods in the Werewolf game, but also successfully modulates its performance levels to meet predefined win rate targets. These results pave the way for adaptive and balanced gameplay by LLM agents in SDGs, opening new avenues for research in controllable game agents.

Paper Structure (8 sections, 12 equations, 2 figures, 3 tables)

Introduction
DVM
Agent Components
Training Methods
Experiments
Controllability of the Agent
Overall Performance of Agent Framework
Conclusion

Figures (2)

Figure 1: The framework of DVM. DVM consists of three parts: Predictor, Decider, and Discussor. The final reward is obtained by adding the step reward and the decision chain reward.
Figure 2: Controllability performance of agents. Each method was used to control the village side in the game under different win rate constraints. The other roles were controlled by Thinker. We conducted 30 games under each setting and measured the actual win rate for the village side.
