CSP4SDG: Constraint and Information-Theory Based Role Identification in Social Deduction Games with LLM-Enhanced Inference
Kaijie Xu, Fandi Meng, Clark Verbrugge, Simon Lucas
TL;DR
CSP4SDG addresses hidden-role inference in social deduction games by formulating it as a training-free probabilistic constraint-satisfaction problem. A lightweight LLM converts raw game logs into four constraint types (Evidence, Phenomena, Assertions, Hypotheses); hard constraints then prune impossible role assignments, and information-gain-weighted soft constraints score the remainder to produce calibrated posteriors Pr(r|C_t) and a MAP assignment. The approach unifies classical CSPs with information theory, yielding interpretable, updateable posteriors and a plug-and-play module that can boost or stand in for LLM-based reasoning. Empirical results on Avalon, Mafia, and AvalonLogs show that CSP4SDG consistently outperforms pure LLM baselines and improves LLM reasoning when the two are combined, highlighting principled probabilistic reasoning as a scalable complement to neural models in SDGs. The work demonstrates that structured constraint-based reasoning can achieve high accuracy with interpretability and real-time updating, offering practical value for AI agents and human analysts in deception-rich interactive domains.
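The prune-then-score pipeline summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the three-player toy game, and the log-linear scoring rule (a satisfied soft constraint adds its weight to the log-score) are all assumptions made for illustration, and the weights are fixed by hand rather than derived from information gain.

```python
from itertools import permutations
from math import exp

def enumerate_assignments(players, roles):
    """Every distinct way to hand the role multiset to the players."""
    return [dict(zip(players, p)) for p in sorted(set(permutations(roles)))]

def posterior(players, roles, hard, soft):
    """Prune with hard constraints, then score with weighted soft constraints.

    hard: list of predicates assignment -> bool (all must hold).
    soft: list of (weight, predicate) pairs; a satisfied predicate adds its
          weight to the log-score (assumed log-linear rule, hypothetical).
    """
    feasible = [a for a in enumerate_assignments(players, roles)
                if all(h(a) for h in hard)]
    scores = [exp(sum(w for w, s in soft if s(a))) for a in feasible]
    z = sum(scores)  # normalizer so the probabilities sum to 1
    return [(a, s / z) for a, s in zip(feasible, scores)]

def role_marginals(post, players):
    """Per-player role probabilities, summed over feasible assignments."""
    marg = {p: {} for p in players}
    for a, pr in post:
        for p in players:
            marg[p][a[p]] = marg[p].get(a[p], 0.0) + pr
    return marg

# Toy game (hypothetical): three players, one of whom is evil.
players = ["A", "B", "C"]
roles = ["good", "good", "evil"]
hard = [lambda a: a["A"] == "good"]         # e.g. observer A knows its own role
soft = [(1.0, lambda a: a["B"] == "evil")]  # e.g. a suspicion voiced about B

post = posterior(players, roles, hard, soft)
map_assignment, map_prob = max(post, key=lambda x: x[1])
marg = role_marginals(post, players)
```

In this toy case the hard constraint removes the one assignment in which A is evil, and the soft constraint tilts the remaining two toward B being evil, so `map_assignment` names B as the evil player with probability e/(1+e). Exhaustive enumeration is only practical for small games; it is used here to keep the sketch short.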
Abstract
In Social Deduction Games (SDGs) such as Avalon, Mafia, and Werewolf, players conceal their identities and deliberately mislead others, making hidden-role inference a central and demanding task. Accurate role identification, which forms the basis of an agent's belief state, is therefore the keystone for both human and AI performance. We introduce CSP4SDG, a probabilistic constraint-satisfaction framework that analyses gameplay objectively. Game events and dialogue are mapped to four linguistically agnostic constraint classes: evidence, phenomena, assertions, and hypotheses. Hard constraints prune impossible role assignments, while weighted soft constraints score the remainder; information-gain weighting links each hypothesis to its expected value under entropy reduction, and a simple closed-form scoring rule guarantees that truthful assertions converge to classical hard logic with minimum error. The resulting posterior over roles is fully interpretable and updates in real time. Experiments on three public datasets show that CSP4SDG (i) outperforms LLM-based baselines in every inference scenario, and (ii) boosts LLMs when supplied as an auxiliary "reasoning tool." Our study validates that principled probabilistic reasoning with information theory is a scalable alternative, or complement, to heavyweight neural models for SDGs.
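To make "expected value under entropy reduction" concrete, the following Python sketch computes the standard expected information gain of a yes/no hypothesis over a belief state: the prior Shannon entropy minus the expected entropy after learning whether the hypothesis holds. This is the textbook definition, not necessarily the exact weighting formula used in CSP4SDG; the function names and the four-world toy belief are hypothetical.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

def expected_information_gain(belief, hypothesis):
    """Expected entropy reduction from learning whether `hypothesis` holds.

    belief: list of (assignment, probability) pairs summing to 1.
    hypothesis: predicate assignment -> bool.
    """
    prior_h = entropy([p for _, p in belief])
    expected_posterior_h = 0.0
    for outcome in (True, False):
        branch = [p for a, p in belief if bool(hypothesis(a)) == outcome]
        mass = sum(branch)  # probability that the hypothesis has this outcome
        if mass > 0:
            # Entropy of the belief renormalized to this branch, weighted by mass.
            expected_posterior_h += mass * entropy([p / mass for p in branch])
    return prior_h - expected_posterior_h

# Toy belief (hypothetical): exactly one of four players is evil, all equally likely.
belief = [({"evil": name}, 0.25) for name in ["A", "B", "C", "D"]]

ig_single = expected_information_gain(belief, lambda a: a["evil"] == "A")
ig_split = expected_information_gain(belief, lambda a: a["evil"] in ("A", "B"))
```

Here the hypothesis "A or B is evil" splits the belief in half and is worth a full bit, while "A is evil" is worth about 0.81 bits, so an information-gain weighting would rank the first hypothesis as more informative.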
