Beyond Local Views: Global State Inference with Diffusion Models for Cooperative Multi-Agent Reinforcement Learning

Zhiwei Xu, Hangyu Mao, Nianmin Zhang, Xin Xin, Pengjie Ren, Dapeng Li, Bin Zhang, Guoliang Fan, Zhumin Chen, Changwei Wang, Jiangjin Yin

TL;DR

This paper addresses partial observability in cooperative multi-agent reinforcement learning by enabling each agent to infer the global state from its local observations. It proposes SIDIFF, which pairs a diffusion-based State Generator with a Vision Transformer-based State Extractor: the generator reconstructs the global state, and the extractor distills the information from it that informs decentralized actions. The approach is compatible with centralized training with decentralized execution (CTDE) and is shown to improve performance on SMAC, VMAS, and the new MABC environment, outperforming several baselines. The paper highlights the value of explicit global-state inference for online MARL and points to future work on faster diffusion methods and multi-task extensions.

Abstract

In partially observable multi-agent systems, agents typically only have access to local observations. This severely hinders their ability to make precise decisions, particularly during decentralized execution. To alleviate this problem and inspired by image outpainting, we propose State Inference with Diffusion Models (SIDIFF), which uses diffusion models to reconstruct the original global state based solely on local observations. SIDIFF consists of a state generator and a state extractor, which allow agents to choose suitable actions by considering both the reconstructed global state and local observations. In addition, SIDIFF can be effortlessly incorporated into current multi-agent reinforcement learning algorithms to improve their performance. Finally, we evaluated SIDIFF on different experimental platforms, including Multi-Agent Battle City (MABC), a novel and flexible multi-agent reinforcement learning environment we developed. SIDIFF achieved desirable results and outperformed other popular algorithms.
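
The abstract describes reconstructing a global state from a local observation with a diffusion model. As a rough illustration only, the sketch below runs standard DDPM ancestral sampling conditioned on an observation. It is not the authors' implementation: the function names, the zero-padding conditioning, the linear noise schedule, and the toy noise predictor (which stands in for a trained network) are all assumptions made for this example.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule, as used in standard DDPM (assumed here)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def pad_observation(local_obs, state_dim):
    """Hypothetical conditioning: zero-pad the observation to state size."""
    cond = np.zeros(state_dim)
    cond[: local_obs.shape[0]] = local_obs
    return cond

def toy_noise_predictor(x_t, t, cond, alpha_bars):
    """Stand-in for a trained network eps_theta(x_t, t, cond).

    It pretends the clean state equals `cond` and solves the forward
    relation x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps for eps.
    """
    return (x_t - np.sqrt(alpha_bars[t]) * cond) / np.sqrt(1.0 - alpha_bars[t])

def reconstruct_state(local_obs, state_dim, num_steps=50, seed=0):
    """Denoise from Gaussian noise to a state estimate, step by step."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    cond = pad_observation(local_obs, state_dim)
    x = rng.standard_normal(state_dim)  # start from pure noise
    for t in reversed(range(num_steps)):
        eps = toy_noise_predictor(x, t, cond, alpha_bars)
        # Posterior mean of the DDPM reverse step.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added at the final step (t == 0).
        noise = rng.standard_normal(state_dim) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

if __name__ == "__main__":
    obs = np.array([0.5, -1.0, 2.0])  # hypothetical local observation
    print(reconstruct_state(obs, state_dim=8))
```

Because the toy predictor assumes the clean state is the padded observation, the loop simply recovers that vector. In a learned setting, the predictor would be a network trained to recover global states, and the reconstructed state would then be passed to a separate extractor alongside the local observation.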

Paper Structure (27 sections, 10 equations, 12 figures, 2 tables, 2 algorithms)

Figures (12)

  • Figure 1: In the multi-agent system, an agent under the SIDIFF framework chooses reasonable actions after two steps: 1) reconstructing the state and 2) extracting information.
  • Figure 2: The overall framework and workflow of SIDIFF.
  • Figure 3: Performance comparison with baselines in SMAC. SIDIFF-QMIX outperforms QMIX in almost all scenarios.
  • Figure 4: Comparison of our approach against baseline algorithms on Vectorized Multi-Agent Simulator.
  • Figure 5: Screenshots of tasks in Multi-Agent Battle City.
  • ...and 7 more figures