PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games
Qinglin Zhu, Runcong Zhao, Bin Liang, Jinhua Du, Lin Gui, Yulan He
TL;DR
This work tackles the challenge of enabling LLM-based agents to reason and interact effectively in Murder Mystery Games (MMGs) by introducing the WellPlay dataset and the PLAYER* framework. WellPlay provides a benchmark of 1,482 inferential questions across 12 MMGs that assesses how well agents understand objectives, reasoning, and relationships in multi-agent social settings. PLAYER* combines a sensor-based state representation with information-theoretic questioning and a pruning mechanism to narrow the suspect space efficiently, achieving higher reasoning accuracy, greater efficiency, and better agent-human interaction than existing methods. The results demonstrate the value of integrating structured state representations, entropy-guided questioning, and memory-aware search for complex social tasks, with practical implications for AI agents in narrative-rich, interactive environments.
Abstract
We introduce WellPlay, a reasoning dataset for multi-agent conversational inference in Murder Mystery Games (MMGs). WellPlay comprises 1,482 inferential questions across 12 games, spanning objectives, reasoning, and relationship understanding, and establishes a systematic benchmark for evaluating agent reasoning abilities in complex social settings. Building on this foundation, we present PLAYER*, a novel framework for Large Language Model (LLM)-based agents in MMGs. MMGs pose unique challenges, including undefined state spaces, absent intermediate rewards, and the need for strategic reasoning through natural language. PLAYER* addresses these challenges with a sensor-based state representation and an information-driven strategy that optimises questioning and suspect pruning. Experiments show that PLAYER* outperforms existing methods in reasoning accuracy, efficiency, and agent-human interaction, advancing reasoning agents for complex social scenarios.
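The idea of information-driven questioning with suspect pruning can be illustrated with a generic sketch. This is not the PLAYER* implementation: it assumes a simple probabilistic belief over suspects, yes/no questions with a hand-specified likelihood per suspect, and a fixed pruning threshold. All names here (`entropy`, `posterior`, `expected_information_gain`, `prune`, the suspects, and the questions) are hypothetical.

```python
import math


def entropy(belief):
    """Shannon entropy (in bits) of a dict mapping suspect -> probability."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def posterior(belief, likelihood, answer):
    """Bayes update of the belief after a yes/no answer.

    likelihood[s] is the ASSUMED probability of a 'yes' answer
    if suspect s is the culprit (an illustrative modelling choice).
    """
    weights = {
        s: p * (likelihood[s] if answer else 1.0 - likelihood[s])
        for s, p in belief.items()
    }
    total = sum(weights.values())
    if total == 0:  # answer was impossible under the model; keep the prior
        return dict(belief)
    return {s: w / total for s, w in weights.items()}


def expected_information_gain(belief, likelihood):
    """Prior entropy minus the expected posterior entropy for one question."""
    p_yes = sum(p * likelihood[s] for s, p in belief.items())
    expected_posterior_entropy = 0.0
    for answer, p_answer in ((True, p_yes), (False, 1.0 - p_yes)):
        if p_answer > 0:
            expected_posterior_entropy += p_answer * entropy(
                posterior(belief, likelihood, answer)
            )
    return entropy(belief) - expected_posterior_entropy


def prune(belief, threshold=0.05):
    """Drop suspects below the (assumed) threshold and renormalise."""
    kept = {s: p for s, p in belief.items() if p >= threshold}
    if not kept:  # never prune everyone
        return dict(belief)
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}


# Hypothetical game state: four equally likely suspects, two candidate questions.
belief = {"Alice": 0.25, "Bob": 0.25, "Carol": 0.25, "Dave": 0.25}
questions = {
    "alibi": {"Alice": 1.0, "Bob": 1.0, "Carol": 0.0, "Dave": 0.0},
    "weapon": {"Alice": 1.0, "Bob": 0.0, "Carol": 0.0, "Dave": 0.0},
}

# Ask the question that is expected to reduce uncertainty the most.
best = max(questions, key=lambda q: expected_information_gain(belief, questions[q]))
gain = expected_information_gain(belief, questions[best])

# Suppose the answer is 'yes': update the belief, then prune unlikely suspects.
remaining = prune(posterior(belief, questions[best], True))
print(best, round(gain, 3), sorted(remaining))
```

In this toy setup the question that splits the suspects evenly is preferred over one that singles out a single suspect, because it is expected to remove more uncertainty; pruning then discards the suspects the answer has ruled out.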
