LLM-Hanabi: Evaluating Multi-Agent Gameplays with Theory-of-Mind and Rationale Inference in Imperfect Information Collaboration Game
Fangzhou Liang, Tianshi Zheng, Chunkit Chan, Yauwai Yim, Yangqiu Song
TL;DR
This work tackles the challenge of evaluating rationale inference and Theory-of-Mind (ToM) in multi-agent collaboration under imperfect information. It introduces LLM-Hanabi, a benchmark that maps Hanabi gameplay to natural-language interactions and uses a two-phase ToM evaluation (reasoning extraction during play, then post-game scoring by a judge) to quantify ToM proficiency. Across a diverse set of LLMs and large reasoning models (LRMs), ToM ability strongly correlates with cooperative success, and first-order ToM is a better predictor of performance than second-order ToM. The findings suggest that enabling an AI to accurately infer a partner's rationale is more crucial for collaboration than modeling higher-order beliefs, and the benchmark provides a scalable platform for future improvements in collaborative AI.
Abstract
Effective multi-agent collaboration requires agents to infer the rationale behind others' actions, a capability rooted in Theory-of-Mind (ToM). While recent Large Language Models (LLMs) excel at logical inference, their ability to infer rationale in dynamic, collaborative settings remains under-explored. This study introduces LLM-Hanabi, a novel benchmark that uses the cooperative game Hanabi to evaluate the rationale inference and ToM of LLMs. Our framework features an automated evaluation system that measures both game performance and ToM proficiency. Across a range of models, we find a significant positive correlation between ToM and in-game success. Notably, first-order ToM (interpreting others' intent) correlates more strongly with performance than second-order ToM (predicting others' interpretations). These findings highlight that for effective AI collaboration, the ability to accurately interpret a partner's rationale is more critical than higher-order reasoning. We conclude that prioritizing first-order ToM is a promising direction for enhancing the collaborative capabilities of future models.
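To make the correlation claim concrete, the following is a minimal sketch of how per-model ToM scores could be compared against game scores using a Pearson correlation. The statistic, the function name `pearson_r`, the variable names, and every number below are assumptions made for illustration; the values are invented placeholders, not results from this work, and the abstract does not state which correlation measure the authors use.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-model values, one entry per model.
# These are made-up placeholders, NOT numbers reported by the benchmark.
game_score = [10.0, 14.0, 12.0, 18.0, 16.0]
first_order_tom = [5.0, 7.0, 6.0, 9.0, 8.0]
second_order_tom = [6.0, 5.0, 7.0, 8.0, 6.0]

r_first = pearson_r(first_order_tom, game_score)
r_second = pearson_r(second_order_tom, game_score)

print(f"first-order ToM vs. game score:  r = {r_first:.3f}")
print(f"second-order ToM vs. game score: r = {r_second:.3f}")
```

With real benchmark output, the two placeholder lists would be replaced by the judge-assigned first-order and second-order ToM scores for each model, and the comparison of the two coefficients would indicate which ToM order tracks game performance more closely.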
