Multi-agent Reinforcement Learning: A Comprehensive Survey
Dom Huh, Prasant Mohapatra
TL;DR
This survey analyzes multi-agent reinforcement learning (MARL) through the lens of game theory (GT) and machine learning (ML), outlining how decentralized multi-agent environments, stochastic games, and GT concepts shape learning dynamics. It provides a taxonomy of multi-agent system (MAS) representations, learning paradigms (CTCE/DTDE/CTDE, i.e. combinations of centralized or decentralized training and execution), credit assignment, communication, MOA, ad-hoc team-play, and social learning, and details core challenges such as non-stationarity, scalability, and evaluation. The work emphasizes integrating GT insights with data-driven MARL methods to guide robust, scalable, and socially aware agent coordination, and highlights directions such as foundation models, open-source ecosystems, and advanced communication and MOA techniques. Overall, the paper offers a holistic framework for understanding the current state of MARL and motivates future GT-informed ML advances in multi-agent control.
Abstract
Multi-agent systems (MAS) are prevalent and important in numerous real-world applications, where multiple agents must make decisions to achieve their objectives in a shared environment. Despite their ubiquity, developing intelligent decision-making agents in MAS poses several open challenges. This survey examines these challenges, emphasizing seminal concepts from game theory (GT) and machine learning (ML) and connecting them to recent advancements in multi-agent reinforcement learning (MARL), i.e., the study of data-driven decision-making within MAS. The objective of this survey is to provide a comprehensive perspective along the various dimensions of MARL, shedding light on the unique opportunities presented in MARL applications while highlighting the inherent challenges that accompany this potential. We hope that our work will not only contribute to the field by analyzing the current landscape of MARL but also motivate future directions with insights for deeper integration of concepts from the related domains of GT and ML. To this end, this work explores past and recent efforts in MARL and its related fields, describing previously proposed solutions, their limitations, and their applications.
