Symmetries-enhanced Multi-Agent Reinforcement Learning

Nikolaos Bousias, Stefanos Pertigkiozoglou, Kostas Daniilidis, George Pappas

TL;DR

This work tackles generalization and scalability challenges in multi-agent reinforcement learning by introducing extrinsic symmetries as a policy inductive bias and formalizing them within a geometric framework. It proposes the Group Equivariant Graphormer, a modular, group-canonicalized architecture that can realize $G$-equivariance on tensorial graph features for distributed swarming tasks, including $SE(3)$-based scenarios. The authors show that, under a $G$-equivariant formulation, the optimal policy is itself $G$-equivariant and that non-equivariant dynamics can be lifted to an extended $G$-equivariant system with a symmetry-breaking projection to recover the original policy. Empirically, the method yields substantial gains in generalization and zero-shot scalability, achieving lower collision rates and higher task success across varying swarm sizes for symmetry-breaking quadrotors.
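The idea of obtaining equivariance through canonicalization can be illustrated with a toy sketch. The code below is not the Group Equivariant Graphormer and does not handle $SE(3)$; it assumes planar rotations ($SO(2)$) acting on a small set of 2D agent positions. All names (`rotation`, `frame`, `backbone`, `canonicalized_policy`) are hypothetical. The sketch rotates the inputs into a frame derived from the data, applies an arbitrary non-equivariant map, and rotates the outputs back, which makes the overall map rotation-equivariant.

```python
import numpy as np


def rotation(theta: float) -> np.ndarray:
    """2x2 rotation matrix for angle theta (an element of SO(2))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


def frame(points: np.ndarray) -> np.ndarray:
    """Equivariant frame: rotation aligned with the centroid direction.

    Rotating the points by g rotates the centroid by g, so
    frame(g x) = g frame(x). Assumes the centroid is non-zero.
    """
    m = points.mean(axis=0)
    return rotation(np.arctan2(m[1], m[0]))


def backbone(points: np.ndarray) -> np.ndarray:
    """Arbitrary, deliberately non-equivariant per-agent map (illustrative)."""
    W = np.array([[0.3, -1.2], [0.7, 0.5]])
    b = np.array([0.1, -0.4])
    return np.tanh(points @ W + b)


def canonicalized_policy(points: np.ndarray) -> np.ndarray:
    """Canonicalize, apply the backbone, then map the result back."""
    H = frame(points)
    canonical = points @ H         # applies H^{-1} = H^T to every row
    actions = backbone(canonical)  # computed in the canonical frame
    return actions @ H.T           # applies H to every row


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))  # five agents in the plane
g = rotation(0.9)            # an arbitrary group element

# Equivariance check: rotating the input rotates the output the same way.
lhs = canonicalized_policy(x @ g.T)
rhs = canonicalized_policy(x) @ g.T
print(np.allclose(lhs, rhs))
```

The backbone on its own does not commute with rotations; only the wrapper built around the data-dependent frame does. This is what lets a canonicalized design reuse an unconstrained network.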

Abstract

Multi-agent reinforcement learning has emerged as a powerful framework for enabling agents to learn complex, coordinated behaviors but faces persistent challenges regarding its generalization, scalability and sample efficiency. Recent advancements have sought to alleviate those issues by embedding intrinsic symmetries of the systems in the policy. Yet, most dynamical systems exhibit little to no symmetries to exploit. This paper presents a novel framework for embedding extrinsic symmetries in multi-agent system dynamics that enables the use of symmetry-enhanced methods to address systems with insufficient intrinsic symmetries, expanding the scope of equivariant learning to a wide variety of MARL problems. Central to our framework is the Group Equivariant Graphormer, a group-modular architecture specifically designed for distributed swarming tasks. Extensive experiments on a swarm of symmetry-breaking quadrotors validate the effectiveness of our approach, showcasing its potential for improved generalization and zero-shot scalability. Our method achieves significant reductions in collision rates and enhances task success rates across a diverse range of scenarios and varying swarm sizes.

Paper Structure (20 sections, 12 theorems, 19 equations, 3 figures, 2 tables)

Key Result

Theorem 1

The optimal control policy $\pi^{*}(x_i(t)\cup o_i(t))$ for the $G$-equivariant multi-robot problem, i.e., equation (eq:prob) with definition (eq:G-equivariant_problem), is equivariant under group actions from elements of $G$. (The proof is given in Appendix I of the supplemental material.)
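
For reference, equivariance of a policy under $G$ is commonly written as below. The notation $g \cdot$ for the action of a group element on the policy's input and on its output is an assumption made here for illustration; the paper's own definitions (eq:prob, eq:G-equivariant_problem) are not reproduced on this page and may use different symbols.

$$
\pi^{*}\big(g \cdot (x_i(t)\cup o_i(t))\big) \;=\; g \cdot \pi^{*}\big(x_i(t)\cup o_i(t)\big), \qquad \forall\, g \in G .
$$

Read this way, transforming an agent's state and observations by $g$ and then applying the policy gives the same result as applying the policy first and then transforming the resulting action by $g$.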

Figures (3)

  • Figure 1: Instances of the swarm in various scenarios.
  • Figure 2: Architecture schematic.
  • Figure 3: Instances of the swarm in various scenarios.

Theorems & Definitions (26)

  • Definition 1
  • Definition 2
  • Definition 3
  • Theorem 1
  • Lemma 1
  • Theorem 2
  • Remark 1
  • Proposition 1
  • Lemma 2
  • Proof: Theorem (theorem:equivariant_optimal_control)
  • ...and 16 more