Graphon Mean-Field Control for Cooperative Multi-Agent Reinforcement Learning

Yuanquan Hu; Xiaoli Wei; Junji Yan; Hengxi Zhang

TL;DR

This paper proposes a graphon mean-field control framework to approximate cooperative multi-agent reinforcement learning (MARL) with nonuniform interactions and shows that the approximation error is of order $\mathcal{O}(\frac{1}{\sqrt{N}})$, where $N$ is the number of agents.

Abstract

The marriage between mean-field theory and reinforcement learning has shown a great capacity to solve large-scale control problems with homogeneous agents. To break the homogeneity restriction of mean-field theory, recent work has introduced graphon theory into the mean-field paradigm. In this paper, we propose a graphon mean-field control (GMFC) framework to approximate cooperative multi-agent reinforcement learning (MARL) with nonuniform interactions and show that the approximation error is of order $\mathcal{O}(\frac{1}{\sqrt{N}})$, where $N$ is the number of agents. By discretizing the graphon index of GMFC, we further introduce a smaller class of GMFC called block GMFC, which is shown to approximate cooperative MARL well. Our empirical studies on several examples demonstrate that our GMFC approach performs comparably to state-of-the-art MARL algorithms while enjoying better scalability.
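
The idea of discretizing the graphon index can be illustrated with a small sketch. It is not the paper's algorithm: it only shows how nonuniform interaction weights among $N$ agents might be read off a graphon $W(x, y)$ and then replaced by a piecewise-constant version with $M$ blocks. The example graphon, the midpoint grid, and all function names (`graphon_weights`, `block_weights`, `block_of_agent`) are assumptions made for illustration.

```python
import numpy as np

def example_graphon(x, y):
    # Hypothetical graphon chosen for illustration only (symmetric, values in [0, 1]).
    return 1.0 - np.maximum(x, y)

def graphon_weights(graphon, n_agents):
    # Assumption: agent i sits at the midpoint (i + 0.5) / N of its index interval.
    idx = (np.arange(n_agents) + 0.5) / n_agents
    return graphon(idx[:, None], idx[None, :])

def block_weights(graphon, n_blocks):
    # Piecewise-constant approximation: one weight per pair of blocks,
    # evaluated at the block midpoints.
    mid = (np.arange(n_blocks) + 0.5) / n_blocks
    return graphon(mid[:, None], mid[None, :])

def block_of_agent(n_agents, n_blocks):
    # Map each agent's index in [0, 1] to the block that contains it.
    idx = (np.arange(n_agents) + 0.5) / n_agents
    return np.minimum((idx * n_blocks).astype(int), n_blocks - 1)

N, M = 200, 10
W_full = graphon_weights(example_graphon, N)   # N x N agent-level weights
W_block = block_weights(example_graphon, M)    # M x M block-level weights
blocks = block_of_agent(N, M)
W_lifted = W_block[np.ix_(blocks, blocks)]     # block weights lifted back to agents
max_err = np.abs(W_full - W_lifted).max()      # sup-norm gap between the two
```

For this particular graphon, which is 1-Lipschitz in each argument, the gap `max_err` is bounded by half the block width, $1/(2M)$. Agents in the same block share the same interaction weights, which is what makes the block version a smaller problem class.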

Paper Structure (22 sections, 10 theorems, 43 equations, 2 figures, 3 tables, 1 algorithm)

Introduction
Our Work
Outline
Mean-Field MARL on Dense Graphs
Preliminary: Graphon Theory
Cooperative MARL with Nonuniform Interactions
Graphon Mean-Field Control
Main Results
Reformulation of GMFC
Approximation
Algorithm Design
Proofs of Main Results
Proof of Theorem 3.7 (Approximate Pareto Property)
Proof of Theorem 3.8 (Existence of Optimal Policy Ensemble)
Proof of the Approximate Pareto Property for Discretized GMFC
...and 7 more sections

Key Result

Theorem 3.1

GMFC (equ:GMFC)-(eq:GMFC_reward) can be reformulated as a constrained optimization problem expressed through an aggregated reward $R: {\pmb {\cal M}} \times {\pmb \Pi} \to \mathbb{R}$ and aggregated transition dynamics ${\pmb \Phi}: {\pmb {\cal M}} \times {\pmb \Pi} \to {\pmb {\cal M}}$.

Figures (2)

Figure 1: Experiments for different graphons in SIS finite-agent environment
Figure 2: Experiments for different graphons in Malware Spread finite-agent environment

Theorems & Definitions (13)

Remark 2.2
Definition 2.3
Theorem 3.1
Remark 3.2
Theorem 3.7: Approximate Pareto Property
Theorem 3.8: Existence of Optimal Policy Ensemble
Theorem 3.9
Lemma 4.1
Lemma 4.2
Lemma 4.3: Verification
...and 3 more
