The Five Ws of Multi-Agent Communication: Who Talks to Whom, When, What, and Why -- A Survey from MARL to Emergent Language and LLMs
Jingdi Chen, Hanqing Yang, Zongjun Liu, Carlee Joe-Wong
TL;DR
This survey unifies three strands of multi-agent communication (MA-Comm), namely MARL-based learned communication, Emergent Language, and LLM-powered coordination, under the Five Ws framework: who communicates with whom, what is shared, when communication occurs, and why it helps. It traces the evolution from hand-designed MARL protocols to end-to-end learned messaging, discrete emergent languages, and language-grounded LLM systems, and it articulates bridging patterns across paradigms. The work highlights practical design patterns, cross-paradigm gaps, and open challenges (grounding, interpretability, scalability, and theoretical guarantees) to guide future hybrid systems that couple learning, language, and control. By providing a cross-cutting taxonomy, bridge sections, and forward-looking open problems, the paper aims to catalyze robust, scalable, and human-aligned multi-agent communication in real-world settings.
Abstract
Multi-agent sequential decision-making powers many real-world systems, from autonomous vehicles and robotics to collaborative AI assistants. In dynamic, partially observable environments, communication is often what reduces uncertainty and makes collaboration possible. This survey reviews multi-agent communication (MA-Comm) through the Five Ws: who communicates with whom, what is communicated, when communication occurs, and why communication is beneficial. This framing offers a clean way to connect ideas across otherwise separate research threads. We trace how communication approaches have evolved across three major paradigms. In Multi-Agent Reinforcement Learning (MARL), early methods used hand-designed or implicit protocols, followed by end-to-end learned communication optimized for reward and control. While successful, these protocols are frequently task-specific and hard to interpret, motivating work on Emergent Language (EL), where agents can develop more structured or symbolic communication through interaction. EL methods, however, still struggle with grounding, generalization, and scalability, which has fueled recent interest in large language models (LLMs) that bring natural language priors for reasoning, planning, and collaboration in more open-ended settings. Across MARL, EL, and LLM-based systems, we highlight how different choices shape communication design, where the main trade-offs lie, and what remains unsolved. We distill practical design patterns and open challenges to support future hybrid systems that combine learning, language, and control for scalable and interpretable multi-agent collaboration.
