Graphon Mean-Field Control for Cooperative Multi-Agent Reinforcement Learning

Yuanquan Hu, Xiaoli Wei, Junji Yan, Hengxi Zhang

TL;DR

This paper proposes a graphon mean-field control framework to approximate cooperative multi-agent reinforcement learning (MARL) with nonuniform interactions and shows that the approximation error is of order $\mathcal{O}(\frac{1}{\sqrt{N}})$, where $N$ is the number of agents.

Abstract

The marriage between mean-field theory and reinforcement learning has shown a great capacity to solve large-scale control problems with homogeneous agents. To break the homogeneity restriction of mean-field theory, recent work has sought to introduce graphon theory into the mean-field paradigm. In this paper, we propose a graphon mean-field control (GMFC) framework to approximate cooperative multi-agent reinforcement learning (MARL) with nonuniform interactions and show that the approximation error is of order $\mathcal{O}(\frac{1}{\sqrt{N}})$, where $N$ is the number of agents. By discretizing the graphon index of GMFC, we further introduce a smaller class of GMFC called block GMFC, which is shown to approximate cooperative MARL well. Our empirical studies on several examples demonstrate that our GMFC approach is comparable with state-of-the-art MARL algorithms while enjoying better scalability.
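
To make the idea of discretizing the graphon index more concrete, here is a minimal Python sketch. It is not the paper's implementation: the graphon `example_graphon`, the midpoint rule used for the blocks, and the weighted empirical measure in `neighborhood_mean_field` are all assumptions chosen for illustration, following the usual graphon convention that agent $i$ of $N$ carries index $i/N$.

```python
def example_graphon(x, y):
    # Hypothetical graphon on [0, 1]^2, chosen only for illustration
    # (an assumption, not taken from the paper).
    return 1.0 - max(x, y)


def block_graphon(W, M):
    # Assumed discretization: evaluate W at the midpoints of M equal
    # intervals, giving an M x M piecewise-constant approximation.
    mids = [(k + 0.5) / M for k in range(M)]
    return [[W(a, b) for b in mids] for a in mids]


def agent_weights(W, N):
    # Interaction weights W(i/N, j/N) for N agents with evenly spaced indices.
    idx = [(i + 1) / N for i in range(N)]
    return [[W(a, b) for b in idx] for a in idx]


def neighborhood_mean_field(weights, states, n_states):
    # For each agent i, the weighted empirical state distribution
    # (1/N) * sum_j weights[i][j] * onehot(states[j]).
    # The result is generally not normalized to sum to 1.
    n_agents = len(states)
    result = []
    for i in range(n_agents):
        row = [0.0] * n_states
        for j in range(n_agents):
            row[states[j]] += weights[i][j] / n_agents
        result.append(row)
    return result


if __name__ == "__main__":
    blocks = block_graphon(example_graphon, 2)
    weights = agent_weights(example_graphon, 4)
    print(blocks)
    print(neighborhood_mean_field(weights, [0, 1, 1, 0], 2))
```

With a uniform graphon (`W(x, y) = 1` everywhere), every agent sees the same empirical distribution, which recovers the homogeneous mean-field setting that the abstract contrasts against.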

Paper Structure

This paper contains 22 sections, 10 theorems, 43 equations, 2 figures, 3 tables, 1 algorithm.

Key Result

Theorem 3.1

GMFC (equ:GMFC)-(eq:GMFC_reward) can be reformulated as […] subject to […], where the aggregated reward $R: \boldsymbol{\mathcal{M}} \times \boldsymbol{\Pi} \to \mathbb{R}$ and the aggregated transition dynamics $\boldsymbol{\Phi}: \boldsymbol{\mathcal{M}} \times \boldsymbol{\Pi} \to \boldsymbol{\mathcal{M}}$ are given by […]

Figures (2)

  • Figure 1: Experiments for different graphons in SIS finite-agent environment
  • Figure 2: Experiments for different graphons in Malware Spread finite-agent environment

Theorems & Definitions (13)

  • Remark 2.2
  • Definition 2.3
  • Theorem 3.1
  • Remark 3.2
  • Theorem 3.7: Approximate Pareto Property
  • Theorem 3.8: Existence of Optimal Policy Ensemble
  • Theorem 3.9
  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3: Verification
  • ...and 3 more