Efficient Multiagent Planning via Shared Action Suggestions
Dylan M. Asmar, Mykel J. Kochenderfer
TL;DR
This work addresses the intractable complexity of planning in Dec-POMDP-Com by replacing full information-sharing with action-based communication: agents exchange suggested joint actions, which constrain what each agent can infer about the others' beliefs and yield an estimated joint belief that can be used with a centralized policy. The method uses surrogate policies to prune the set of feasible beliefs, forms a joint belief by exact reconstruction or, more practically, by conflation, and selects actions with the MCAS algorithm, which leaves offline policy computation to MPOMDP solvers. Empirical results on a range of Dec-POMDP benchmarks show MCAS achieving near-centralized performance, with scalable online execution and controlled belief-set growth. The approach offers a practical path to scalable multiagent coordination, with implications for autonomous systems and human-agent teams, and points to future work on theory, online solvers, and richer forms of action-based communication.
Abstract
Decentralized partially observable Markov decision processes with communication (Dec-POMDP-Com) provide a framework for multiagent decision making under uncertainty, but the NEXP-complete complexity for finite-horizon problems renders solutions intractable in general. While sharing actions and observations can reduce the complexity to PSPACE-complete, we propose an approach that bridges POMDPs and Dec-POMDPs by communicating only suggested joint actions, eliminating the need to share observations while retaining near-centralized performance. Our algorithm estimates joint beliefs using shared actions to prune infeasible beliefs. Each agent maintains possible belief sets for other agents, pruning them based on suggested actions to form an estimated joint belief usable with any centralized policy. This approach requires solving a POMDP for each agent, reducing computational complexity while preserving performance. We demonstrate its effectiveness on several Dec-POMDP benchmarks, showing performance comparable to centralized methods when shared actions enable effective belief pruning. This action-based communication framework offers a natural avenue for integrating human-agent cooperation, opening new directions for scalable multiagent planning under uncertainty, with applications in both autonomous systems and human-agent teams.
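The pruning and joint-belief estimation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the two-state problem, the `Q_VALUES` table, and the helpers `greedy_action`, `prune_beliefs`, and `conflate` are hypothetical names and numbers chosen for illustration, the action labels stand in for suggested joint actions, and conflation is taken here to mean the normalized elementwise product of distributions.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Belief = Tuple[float, ...]  # probability distribution over states

# Hypothetical action values Q[action][state] for a two-state problem.
# These numbers are invented for illustration only.
Q_VALUES: Dict[str, Sequence[float]] = {
    "listen": (-1.0, -1.0),
    "open-left": (-100.0, 10.0),
    "open-right": (10.0, -100.0),
}


def greedy_action(belief: Belief) -> str:
    """Surrogate policy: the action with the highest expected value under `belief`."""
    return max(
        Q_VALUES,
        key=lambda a: sum(p * q for p, q in zip(belief, Q_VALUES[a])),
    )


def prune_beliefs(
    candidates: List[Belief],
    suggested_action: str,
    policy: Callable[[Belief], str],
) -> List[Belief]:
    """Keep only the candidate beliefs under which `policy` would have
    produced the action the other agent actually suggested."""
    return [b for b in candidates if policy(b) == suggested_action]


def conflate(beliefs: Sequence[Belief]) -> Belief:
    """Combine beliefs by a normalized elementwise product (assumed definition)."""
    product = [1.0] * len(beliefs[0])
    for b in beliefs:
        product = [p * x for p, x in zip(product, b)]
    total = sum(product)
    if total == 0.0:
        raise ValueError("beliefs have disjoint support; cannot normalize")
    return tuple(p / total for p in product)


# Beliefs the other agent might hold, as tracked by this agent.
candidates: List[Belief] = [(0.5, 0.5), (0.05, 0.95), (0.95, 0.05), (0.2, 0.8)]

# The other agent suggests "open-left"; discard beliefs inconsistent with that.
pruned = prune_beliefs(candidates, "open-left", greedy_action)

# Estimate a joint belief from this agent's own belief and the surviving candidate.
own_belief: Belief = (0.3, 0.7)
joint = conflate([own_belief, pruned[0]])
```

In this toy case only one candidate survives pruning, so the estimate is unambiguous. When several candidates remain, some rule for choosing or combining them is needed, and the sketch makes no claim about how the paper handles that case.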
