A Semi-Decentralized Approach to Multiagent Control

Mahdi Al-Husseini; Mykel J. Kochenderfer; Kyle H. Wray

TL;DR

This paper extends semi-decentralization to the partially observable Markov decision process (POMDP) and presents recursive small-step semi-decentralized A* (RS-SDA*), an exact algorithm for generating optimal SDec-POMDP policies.

Abstract

We introduce an expressive framework and algorithms for the semi-decentralized control of cooperative agents in environments with communication uncertainty. Whereas semi-Markov control admits a distribution over time for agent actions, semi-Markov communication, or what we refer to as semi-decentralization, gives a distribution over time for what actions and observations agents can store in their histories. We extend semi-decentralization to the partially observable Markov decision process (POMDP). The resulting SDec-POMDP unifies decentralized and multiagent POMDPs and several existing explicit communication mechanisms. We present recursive small-step semi-decentralized A* (RS-SDA*), an exact algorithm for generating optimal SDec-POMDP policies. RS-SDA* is evaluated on semi-decentralized versions of several standard benchmarks and a maritime medical evacuation scenario. This paper provides a well-defined theoretical foundation for exploring many classes of multiagent communication problems through the lens of semi-decentralization.
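To make the idea of probabilistic history sharing concrete, the following is a minimal toy sketch and not the paper's SDec-POMDP model or RS-SDA*. It assumes a hypothetical function `simulate_memories`, a single communication link that is up independently at each step with probability `comm_prob` (the paper instead describes a semi-Markov distribution over time), and Tiger-style action and observation labels chosen only for illustration. Each agent records its own action and observation; whenever the link is up, all agents pool everything they have stored so far.

```python
import random


def simulate_memories(joint_steps, comm_prob, seed=0):
    """Toy model of semi-decentralized memory (an illustrative assumption).

    joint_steps: list over time of lists of (action, observation), one per agent.
    comm_prob:   probability that all agents can share histories at a step.
    Returns one set of (time, agent, action, observation) records per agent.
    """
    rng = random.Random(seed)
    n_agents = len(joint_steps[0])
    memories = [set() for _ in range(n_agents)]
    for t, joint in enumerate(joint_steps):
        # Each agent always stores its own action and observation.
        for i, (action, obs) in enumerate(joint):
            memories[i].add((t, i, action, obs))
        # With probability comm_prob, agents pool their full histories.
        if rng.random() < comm_prob:
            shared = set().union(*memories)
            memories = [set(shared) for _ in range(n_agents)]
    return memories


# Hypothetical two-agent, two-step trajectory.
steps = [
    [("listen", "hear-left"), ("listen", "hear-right")],
    [("listen", "hear-left"), ("open-right", "none")],
]
never = simulate_memories(steps, comm_prob=0.0)   # no sharing
always = simulate_memories(steps, comm_prob=1.0)  # sharing at every step
print(len(never[0]), len(always[0]))
```

In this sketch, `comm_prob=0.0` leaves each agent with only its own records, which resembles the fully decentralized extreme, while `comm_prob=1.0` gives every agent the same pooled history, which resembles the fully communicating extreme. Intermediate values give the mixed case that semi-decentralization is meant to capture.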

Paper Structure (25 sections, 18 theorems, 21 equations, 9 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Preliminaries
Dec-POMDPs
MPOMDPs
Semi-Markov Processes
Semi-Decentralization
The SDec-POMDP
Theoretical Analysis
MPOMDP
Dec-POMDP
Recursive Small-Step Semi-Decentralized A*
Experiments
Conclusion
Appendix
...and 10 more sections

Key Result

Lemma 1

SDec-POMDP and MPOMDP models are equivalent.

Figures (9)

Figure 1: A semi-decentralized multiagent evacuation scenario with probabilistic restrictions on communication. Aircraft and watercraft must coordinate under communication constraints to move patients from aid stations to hospitals.
Figure 2: The SDec-POMDP dynamic decision network, with the policy infrastructure on the left and model on the right. The green backdrop contains the blackboard with memory $M_c$ generated from the histories of communicating agents. The gray backdrop with plate notation includes the individual agent memories $M_i$. $Z$ selector nodes are selectively toggled by $\bar{\tau}$ to facilitate memory propagation $\eta$, represented by dashed lines. Policy $\psi$ edges are represented by dotted lines. The SDec-POMDP framework is flexible and can be easily modified to capture the structural and informational characteristics of different problem domains.
Figure 3: Illustrating RS-SDA* applied to SDec-Tiger using mixed component policies through stage $\sigma = 2$.
Figure 4: MaritimeMEDEVAC environment representation and centralized/decentralized/semi-decentralized optimal policy values for horizons one through eight.
Figure 5: Illustration of four of nine possible joint actions for SDec-Tiger. Agents communicate their observation histories with some probability when they listen to the same door (in green).
...and 4 more figures

Theorems & Definitions (18)

Lemma 1
Lemma 2
Lemma 3
Proposition 1
Lemma 4
Lemma 5
Lemma 6
Proposition 2
Proposition 3
Proposition 4
...and 8 more
