Explaining Decentralized Multi-Agent Reinforcement Learning Policies

Kayla Boggess, Sarit Kraus, Lu Feng

TL;DR

This work addresses the challenge of explaining decentralized MARL policies by introducing Hasse Diagram Summarization (HDS), a compact, order-preserving representation of inter-agent task coordination and uncertainty. It couples HDS with query-based explanations for When, Why Not, and What queries, using partial comparability graphs, an uncertainty dictionary, and Quine–McCluskey minimization to produce interpretable natural-language outputs. The approach is demonstrated across four MARL domains and two training paradigms (CTDE and DTDE), showing scalable summarization and significantly improved user performance and satisfaction in both the summarization and explanation studies. The results support the practical utility of decentralized explanations for human-agent collaboration and suggest future paths toward interactive systems and integration with language models for clarity and usability. The worst-case complexity of HDS construction is $O(N|T|^2) + O(|T|^4)$, which is polynomial in both the number of agents $N$ and the size of the task set $|T|$.

Abstract

Multi-Agent Reinforcement Learning (MARL) has gained significant interest in recent years, enabling sequential decision-making across multiple agents in various domains. However, most existing explanation methods focus on centralized MARL, failing to address the uncertainty and nondeterminism inherent in decentralized settings. We propose methods to generate policy summarizations that capture task ordering and agent cooperation in decentralized MARL policies, along with query-based explanations for When, Why Not, and What types of user queries about specific agent behaviors. We evaluate our approach across four MARL domains and two decentralized MARL algorithms, demonstrating its generalizability and computational efficiency. User studies show that our summarizations and explanations significantly improve user question-answering performance and enhance subjective ratings on metrics such as understanding and satisfaction.

Paper Structure

This paper contains 18 sections, 3 theorems, 10 figures, 6 tables, and 4 algorithms.

Key Result

Theorem 1

Given a set of agent trajectories $\{\omega^i\}_{i=1}^N$ produced by executing decentralized MARL policies $\{\pi^i\}_{i=1}^N$ in a single episode, the Hasse diagram $\mathcal{D} = (\mathcal{V}, \mathcal{E})$ constructed by the HDS algorithm is both a correct and complete policy summarization.
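To make the object in the theorem concrete, the following is a minimal sketch of how a Hasse diagram over tasks could be assembled from per-agent task sequences and then pruned by transitive reduction. It is not the paper's algorithm: it assumes each trajectory has already been reduced to an ordered list of completed tasks, it ignores cooperation and uncertainty, and the names `hasse_diagram`, `transitive_closure`, and the example tasks are hypothetical.

```python
def transitive_closure(tasks, edges):
    """Warshall-style closure: add (a, c) whenever (a, k) and (k, c) hold."""
    closure = set(edges)
    for k in tasks:
        for a in tasks:
            for c in tasks:
                if (a, k) in closure and (k, c) in closure:
                    closure.add((a, c))
    return closure


def hasse_diagram(task_sequences):
    """Return (vertices, edges) of a Hasse diagram over tasks.

    Assumption: each element of task_sequences is one agent's ordered
    list of completed tasks in a single episode.
    """
    tasks = set()
    edges = set()
    for seq in task_sequences:
        tasks.update(seq)
        # Consecutive tasks in one agent's sequence give a precedence pair.
        edges.update(zip(seq, seq[1:]))

    closure = transitive_closure(tasks, edges)
    if any((t, t) in closure for t in tasks):
        raise ValueError("task sequences are cyclic; no partial order exists")

    # Transitive reduction: keep (a, c) only if no task b lies strictly between.
    reduced = {
        (a, c)
        for (a, c) in closure
        if not any((a, b) in closure and (b, c) in closure for b in tasks)
    }
    return tasks, reduced


# Hypothetical example: three agents, four tasks.
vertices, hasse_edges = hasse_diagram([
    ["A", "B", "D"],
    ["A", "C", "D"],
    ["A", "D"],
])
```

In this example the direct pair ("A", "D") contributed by the third agent is dropped, because it is already implied by the paths through "B" and "C"; tasks "B" and "C" stay incomparable, which is the kind of partial ordering a Hasse diagram is meant to preserve.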

Figures (10)

  • Figure 1: Example of the HDS algorithm constructing a Hasse diagram incrementally: steps (a)–(d) incorporate each agent's task sequence, and step (e) applies transitive reduction.
  • Figure 2: Mean and standard deviation of participant ratings on policy summarizations (* indicates statistically significant difference).
  • Figure 3: Mean and standard deviation of participant performance on query-based explanations (* indicates statistically significant difference).
  • Figure 4: Mean and standard deviation of participant ratings on query-based explanations (* indicates statistically significant difference).
  • Figure 5: Example of baseline summarization.
  • ...and 5 more figures

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 1
  • Theorem 1
  • proof