MQE: Unleashing the Power of Interaction with Multi-agent Quadruped Environment

Ziyan Xiong, Bo Chen, Shiyu Huang, Wei-Wei Tu, Zhaofeng He, Yang Gao

TL;DR

This paper introduces the Multi-agent Quadruped Environment (MQE), a novel platform designed to facilitate the development and evaluation of multi-agent reinforcement learning (MARL) algorithms in realistic and dynamic scenarios. Its findings indicate that hierarchical reinforcement learning can simplify task learning, but also highlight the need for advanced algorithms capable of handling the intricate dynamics of multi-agent interactions.

Abstract

The advent of deep reinforcement learning (DRL) has significantly advanced the field of robotics, particularly in the control and coordination of quadruped robots. However, the complexity of real-world tasks often necessitates the deployment of multi-robot systems capable of sophisticated interaction and collaboration. To address this need, we introduce the Multi-agent Quadruped Environment (MQE), a novel platform designed to facilitate the development and evaluation of multi-agent reinforcement learning (MARL) algorithms in realistic and dynamic scenarios. MQE emphasizes complex interactions between robots and objects, hierarchical policy structures, and challenging evaluation scenarios that reflect real-world applications. We present a series of collaborative and competitive tasks within MQE, ranging from simple coordination to complex adversarial interactions, and benchmark state-of-the-art MARL algorithms. Our findings indicate that hierarchical reinforcement learning can simplify task learning, but also highlight the need for advanced algorithms capable of handling the intricate dynamics of multi-agent interactions. MQE serves as a stepping stone towards bridging the gap between simulation and practical deployment, offering a rich environment for future research in multi-agent systems and robot learning. For open-sourced code and more details of MQE, please refer to https://ziyanx02.github.io/multiagent-quadruped-environment/ .

Paper Structure (29 sections, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Agents learning to herd sheep in hundreds of parallel environments.
  • Figure 2: Demonstration of benchmarking tasks. Blue and green robots are assigned to accomplish collaborative tasks, while red and orange robots play against them. Arrows of different colors illustrate the intended movements of agents and objects in each task. Generally, tasks shown lower in the figure are harder because they demand more advanced locomotion control and environmental awareness.
  • Figure 3: Learning curves of 6 tasks under 2 settings: results using a pre-trained locomotion policy are shown on the left, and results from learning from scratch are shown on the right. Rewards for tasks involving sheep and boxes fluctuate on a small scale because the corresponding objects are reset.
  • Figure 4: Learning curves of two selected rewards during PPO training on Narrow Gate within the first 20M environment steps (average of 5 seeds).