Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game
Zijing Shi, Meng Fang, Shunfeng Zheng, Shilong Deng, Ling Chen, Yali Du
TL;DR
This work tackles ad hoc teamwork (AHT) in language-driven multi-agent settings by introducing AvalonPlay, a multi-round Avalon-based benchmark where a learner must deduce teammates' hidden roles from limited information. It presents CodeAct, a general LLM agent that combines memory retrieval, code-driven reasoning, and a self-debugging interpreter to adapt rapidly to new teammates without predesigned coordination protocols. Experimental results show that CodeAct outperforms semantic prompting strategies (CoT, ReAct) and that GPT-4 is the most effective of the tested models at AHT, though memory forgetting and hallucinations remain pervasive challenges. The study highlights the importance of factual memory and programmable reasoning in robust, on-the-fly collaboration, and outlines future work on autonomous communication and fact verification.
Abstract
Multi-agent collaboration with Large Language Models (LLMs) demonstrates proficiency in basic tasks, yet its efficiency in more complex scenarios remains unexplored. In gaming environments, these agents often face situations without established coordination protocols, requiring them to make intelligent inferences about teammates from limited data. This problem motivates the area of ad hoc teamwork, in which an agent may need to cooperate with a variety of teammates to achieve a shared goal. Our study focuses on the ad hoc teamwork problem where the agent operates in an environment driven by natural language. Our findings reveal the potential of LLM agents in team collaboration, while highlighting issues related to hallucinations in communication. To address this issue, we develop CodeAct, a general agent that equips LLMs with enhanced memory and code-driven reasoning, enabling the repurposing of partial information for rapid adaptation to new teammates.
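To make "code-driven reasoning" over partial information concrete, the following is a minimal illustrative sketch, not the CodeAct implementation. It assumes only the standard Avalon rule that good players always play success cards, so a quest with k fail cards must include at least k evil players. The function names (`consistent_evil_sets`, `suspicion_scores`), the five-player setup, and the quest log are all hypothetical.

```python
from itertools import combinations

def consistent_evil_sets(players, num_evil, quests, known_good=()):
    """Enumerate every evil-player set consistent with the observed quests.

    quests is a list of (team, fail_count) pairs. Under standard Avalon rules
    only evil players can play a fail card, so a team that produced k fails
    must contain at least k evil players.
    """
    known_good = set(known_good)
    candidates = []
    for evil in combinations(players, num_evil):
        evil_set = set(evil)
        if evil_set & known_good:
            continue  # the learner knows these players are good
        if all(len(evil_set & set(team)) >= fails for team, fails in quests):
            candidates.append(evil_set)
    return candidates

def suspicion_scores(players, candidates):
    """Fraction of consistent assignments in which each player is evil."""
    if not candidates:
        return {}
    return {p: sum(p in c for c in candidates) / len(candidates) for p in players}

# Hypothetical game: the learner is "A" (so "A" is known to be good), 2 evil players.
players = ["A", "B", "C", "D", "E"]
quests = [
    (("A", "B"), 0),       # quest succeeded with no fail cards
    (("A", "C", "D"), 1),  # one fail card: C or D (or both) is evil
    (("B", "C"), 1),       # one fail card: B or C (or both) is evil
]
candidates = consistent_evil_sets(players, 2, quests, known_good=["A"])
scores = suspicion_scores(players, candidates)
print(scores)
```

A successful quest adds no constraint here, because evil players may choose to play success cards. Enumerating assignments in code, rather than asking the model to reason about them in free text, keeps the deduction tied to the recorded facts, which is the kind of benefit the abstract attributes to programmable reasoning.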
