${\rm E}(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

Dingyang Chen; Qi Zhang

TL;DR

The paper addresses cooperative MARL under continuous ${\rm E}(3)$-symmetries by formulating group-symmetric Markov games and proving the existence of $G$-invariant optimal values and policies. It then introduces ${\rm E}(3)$-equivariant actor-critic architectures based on steerable GNNs (SEGNNs) within a centralized training/decentralized execution framework, ensuring equivariance to Euclidean transformations. Empirically, the approach yields superior sample efficiency and generalization across MPE, MuJoCo, and SMAC benchmarks, including zero-shot and transfer capabilities, though gains can be environment-dependent when symmetries are imperfect or actions are discrete. The work provides practical code and demonstrates how geometry-aware inductive biases can systematically improve multi-agent learning in physically grounded domains.

Abstract

Identification and analysis of symmetrical patterns in the natural world have led to significant discoveries across various scientific fields, such as the formulation of gravitational laws in physics and advancements in the study of chemical structures. In this paper, we focus on exploiting Euclidean symmetries inherent in certain cooperative multi-agent reinforcement learning (MARL) problems and prevalent in many applications. We begin by formally characterizing a subclass of Markov games with a general notion of symmetries that admits the existence of symmetric optimal values and policies. Motivated by these properties, we design neural network architectures with symmetric constraints embedded as an inductive bias for multi-agent actor-critic methods. This inductive bias results in superior performance in various cooperative MARL benchmarks and impressive generalization capabilities such as zero-shot learning and transfer learning in unseen scenarios with repeated symmetric patterns. The code is available at: https://github.com/dchen48/E3AC.
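To make the symmetry constraint mentioned in the abstract concrete, the following is a minimal sketch of what equivariance to a Euclidean transformation means for a policy. It is an illustration only, not the paper's architecture: `toy_policy`, `rotate_z`, `translate`, and the sample positions are hypothetical names and values introduced here, and the policy is a hand-written rule (step toward the nearest landmark) rather than a learned SEGNN. The check is that rotating and translating every position in the scene rotates the chosen action by the same rotation.

```python
import math

def rotate_z(v, theta):
    """Rotate a 3D vector about the z-axis by theta radians."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def translate(v, t):
    """Shift a 3D point by the offset t."""
    return tuple(a + b for a, b in zip(v, t))

def toy_policy(agent_pos, landmark_positions):
    """Hypothetical policy: displacement toward the nearest landmark."""
    nearest = min(landmark_positions, key=lambda p: math.dist(agent_pos, p))
    return tuple(l - a for l, a in zip(nearest, agent_pos))

def is_close(u, v, tol=1e-9):
    """Component-wise comparison with a small floating-point tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

# Illustrative scene (assumed values): one agent and three landmarks.
agent = (0.0, 0.0, 0.0)
landmarks = [(1.0, 2.0, 0.5), (-3.0, 0.5, 1.0), (4.0, -1.0, 0.0)]

# One element of E(3): a rotation about z followed by a translation.
theta, shift = 0.7, (2.0, -1.0, 3.0)

def g(p):
    return translate(rotate_z(p, theta), shift)

action = toy_policy(agent, landmarks)
action_transformed = toy_policy(g(agent), [g(p) for p in landmarks])

# Equivariance: acting on the transformed scene gives the rotated action.
# The translation cancels because the action is a difference of positions.
equivariant = is_close(action_transformed, rotate_z(action, theta))
print(equivariant)
```

A rule that ignored the scene geometry, such as always stepping along the fixed $+x$ axis, would fail this check for generic rotations, which is the kind of behavior an equivariant architecture rules out by construction.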

Paper Structure (43 sections, 2 theorems, 16 equations, 10 figures, 4 tables)

Introduction
Related Work
Preliminaries
Cooperative Markov Games
Groups and Transformations
Markov Games with Euclidean Symmetries
${\rm E}(3)$-Equivariant Multi-Agent Actor-Critic
${\rm E}(3)$-Equivariant Message Passing
Integration Into Multi-Agent Actor-Critic Methods
Experiments
Results on MPE
Results on (Multi-Agent) MuJoCo Tasks
Results on SMAC
Conclusion
Proof of Theorem 4.4 (Main properties of $G$-symmetric MGs)
...and 28 more sections

Key Result

Theorem 4.4

For a $G$-symmetric MG, [...]. Further, for a $G$-invariant observation-based policy $\nu$, [...].
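
As a reading aid for the statement above, here is a minimal sketch of what $G$-invariance of a value function and a policy generically means. The notation is an assumption made for illustration: $g \cdot s$ and $g \cdot a$ denote the action of a group element $g \in G$ on a state and on a joint action, and $V$, $Q$, $\pi$ stand for a generic value function, action-value function, and policy. This is the standard form of such a condition, not the theorem's exact wording, which is not reproduced here.

$$
V(g \cdot s) = V(s), \qquad Q(g \cdot s,\, g \cdot a) = Q(s, a), \qquad \pi(g \cdot a \mid g \cdot s) = \pi(a \mid s), \qquad \forall g \in G.
$$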

Figures (10)

Figure 1: Illustration of Cooperative Navigation ($N=3$) and its Euclidean symmetries.
Figure 2: Emergence of rotation- and translation-invariancy in MLP actors trained on 3-agent Cooperative Navigation.
Figure 3: Architectures for ${\rm E}(3)$-equivariant MADDPG.
Figure 4: Performance comparison on MPE.
Figure 5: Performance of zero-shot and transfer learning.
...and 5 more figures

Theorems & Definitions (8)

Definition 4.1: $G$-symmetric MG
Definition 4.2: ${\rm E}(3)$-symmetric MGs
Definition 4.3: $G$-invariant MG policies
Theorem 4.4: Main properties of $G$-symmetric MGs, proof in the appendix
Definition 1.1: MG homomorphism
Definition 1.2: Policy lifting in MG homomorphisms
Theorem 1.3: Value equivalence in MG homomorphisms
Proof of Theorem 1.3 (Value equivalence in MG homomorphisms)
