Efficient Multiagent Planning via Shared Action Suggestions

Dylan M. Asmar, Mykel J. Kochenderfer

TL;DR

This work tackles the intractable complexity of Dec-POMDP-Com by replacing full information-sharing with action-based communication: agents exchange suggested joint actions to constrain others' beliefs and form an estimated joint belief that can drive centralized-like policies. The method prunes feasible belief subspaces using surrogate policies, employs exact reconstruction or practical conflation to form joint beliefs, and selects actions through a coordinating MCAS framework that offloads offline policy computation to MPOMDP solvers. Empirical results on a range of Dec-POMDP benchmarks show MCAS achieving near-centralized performance, with scalable online execution and controlled belief-set growth. The approach offers a practical pathway to scalable multiagent coordination, with implications for autonomous systems and human-agent teams, and points to future work on theory, online solvers, and richer forms of action-based communication.

Abstract

Decentralized partially observable Markov decision processes with communication (Dec-POMDP-Com) provide a framework for multiagent decision making under uncertainty, but the NEXP-complete complexity for finite-horizon problems renders solutions intractable in general. While sharing actions and observations can reduce the complexity to PSPACE-complete, we propose an approach that bridges POMDPs and Dec-POMDPs by communicating only suggested joint actions, eliminating the need to share observations while retaining near-centralized performance. Our algorithm estimates joint beliefs using shared actions to prune infeasible beliefs. Each agent maintains possible belief sets for other agents, pruning them based on suggested actions to form an estimated joint belief usable with any centralized policy. This approach requires solving a POMDP for each agent, reducing computational complexity while preserving performance. We demonstrate its effectiveness on several Dec-POMDP benchmarks, showing performance comparable to centralized methods when shared actions enable effective belief pruning. This action-based communication framework offers a natural avenue for integrating human-agent cooperation, opening new directions for scalable multiagent planning under uncertainty, with applications in both autonomous systems and human-agent teams.
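The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a surrogate policy stored as alpha-vectors, a single suggested action rather than a suggested joint action, and a hypothetical two-state problem whose numbers are invented for illustration. The names `greedy_action`, `prune_beliefs`, `ALPHAS`, and `CANDIDATES` are all assumptions. A candidate belief for another agent is kept only if the surrogate policy would have suggested the received action at that belief.

```python
from typing import Dict, List, Sequence, Tuple

Belief = Tuple[float, ...]
AlphaVectors = Dict[str, List[Sequence[float]]]


def greedy_action(belief: Belief, alpha_vectors: AlphaVectors) -> str:
    """Return the action whose alpha-vector has the highest value at `belief`."""
    best_action, best_value = "", float("-inf")
    for action, vectors in alpha_vectors.items():
        for alpha in vectors:
            value = sum(b * a for b, a in zip(belief, alpha))
            if value > best_value:
                best_action, best_value = action, value
    return best_action


def prune_beliefs(
    candidates: List[Belief],
    suggested_action: str,
    alpha_vectors: AlphaVectors,
) -> List[Belief]:
    """Keep only candidate beliefs consistent with the suggested action."""
    return [
        b for b in candidates
        if greedy_action(b, alpha_vectors) == suggested_action
    ]


# Hypothetical surrogate policy for a two-state problem (illustrative values).
ALPHAS: AlphaVectors = {
    "listen": [(-1.0, -1.0)],
    "open_left": [(-100.0, 10.0)],
    "open_right": [(10.0, -100.0)],
}

# Hypothetical beliefs the other agent might hold after one step.
CANDIDATES: List[Belief] = [
    (0.5, 0.5),
    (0.85, 0.15),
    (0.97, 0.03),
    (0.03, 0.97),
]

if __name__ == "__main__":
    kept = prune_beliefs(CANDIDATES, "listen", ALPHAS)
    print(kept)
```

In this toy setting, receiving the suggestion `"listen"` leaves only the two less certain beliefs, because the surrogate policy would open a door at either of the highly confident ones. Pruning is only informative when different beliefs map to different suggested actions, which matches the abstract's caveat that performance approaches centralized methods when shared actions enable effective belief pruning.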

Paper Structure

This paper contains 17 sections, 4 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: Example of pruning reachable beliefs that do not align with the received message $\boldsymbol{\sigma}_t^{2,1}$. This example has $n=3$, $|\mathcal{A}^i|=2$, and $|\mathcal{O}^i|=3$. The process is from agent $1$'s perspective, expanding a single belief estimate for agent $2$.