Learning a Distributed Hierarchical Locomotion Controller for Embodied Cooperation

Chuye Hong; Kangyao Huang; Huaping Liu

TL;DR

This work proposes a distributed hierarchical locomotion control strategy for whole-body cooperation, demonstrates its potential to scale to large numbers of agents, and constructs a set of environments as a benchmark for embodied cooperation.

Abstract

In this work, we propose a distributed hierarchical locomotion control strategy for whole-body cooperation and demonstrate its potential to scale to large numbers of agents. Our method uses a hierarchical structure to break complex tasks down into smaller, manageable sub-tasks. By incorporating spatiotemporal continuity features, we establish the sequential logic necessary for causal inference and cooperative behaviour in sequential tasks, thereby facilitating efficient and coordinated control strategies. Agents trained within this framework show enhanced adaptability and cooperation, leading to superior task-completion performance compared to the original methods. Moreover, we construct a set of environments as a benchmark for embodied cooperation.

Paper Structure (33 sections, 2 equations, 8 figures, 1 table)

Introduction
Related Works
Embodied Cooperation
HRL in Locomotion Control
Problem Statement
Distributed Hierarchical Reinforcement Learning
Task Decomposition and Hierarchical Learning
Distributed Scalable HRL
Spatiotemporal Memory Recurrence
Training Curriculum
Experiments
Environments Construction
Results of Distributed HRL
Training Basic Locomotion Control
Training the Interactive Collaboration
...and 18 more sections

Figures (8)

Figure 1: Embodied Cooperation Environments.
Figure 2: Distributed HRL pipeline. The figure illustrates the information flow within the distributed hierarchical reinforcement learning framework, using the Cooperative Transport scenario as an example. The framework uses centralized training with distributed control and decomposes complex behaviours into three levels: Upper Layer (UL), Middle Layer (ML), and Lower Layer (LL). The UL module processes the external information $e_{t}$, including environmental perception and the relative positions of colleagues, extracts features, and sends them to the ML module. The ML module is a recurrent neural network layer that maintains a recurrent state $h$ capturing temporal and spatial correlation and outputs a locomotion command to the LL module. The LL module is a pre-trained locomotion control layer that generates the action $a_t$ from the proprioceptive observation $p_t$ and applies it to the agent, with two modes available: position and velocity. The goal of Cooperative Transport is for a group of Ant robots to collaboratively move a cylindrical object to the red target zone.
Figure 3: Exteroceptive state configuration in distributed HRL.
Figure 4: Distributed HRL results for all scenarios.
Figure 5: (a) shows the training results for different Ant populations in Cooperative Transport. (b) and (c) show the training results with large numbers of agents.
...and 3 more figures
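
The three-level information flow described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the class name `HierarchicalController`, all dimensions, the random placeholder weights, and the choice of a simple tanh recurrent cell for the ML module are assumptions made for the example. In the paper the LL module is pre-trained and the whole framework is trained centrally; here every weight is just a random stand-in.

```python
import numpy as np


class HierarchicalController:
    """Hypothetical per-agent UL -> ML -> LL controller (placeholder weights)."""

    def __init__(self, ext_dim, prop_dim, feat_dim=16, hidden_dim=32,
                 cmd_dim=2, act_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1  # assumed initialisation scale, not from the paper
        self.W_ul = rng.normal(0.0, scale, (feat_dim, ext_dim))       # UL features
        self.W_in = rng.normal(0.0, scale, (hidden_dim, feat_dim))    # ML input
        self.W_rec = rng.normal(0.0, scale, (hidden_dim, hidden_dim)) # ML recurrence
        self.W_cmd = rng.normal(0.0, scale, (cmd_dim, hidden_dim))    # ML command head
        # Stand-in for the pre-trained (frozen) LL locomotion policy.
        self.W_ll = rng.normal(0.0, scale, (act_dim, prop_dim + cmd_dim))
        self.h = np.zeros(hidden_dim)  # recurrent state h

    def reset(self):
        """Clear the recurrent state at the start of an episode."""
        self.h = np.zeros_like(self.h)

    def step(self, e_t, p_t):
        """Map external info e_t and proprioception p_t to an action a_t."""
        feat = np.tanh(self.W_ul @ e_t)                            # UL: extract features
        self.h = np.tanh(self.W_in @ feat + self.W_rec @ self.h)  # ML: update h
        command = np.tanh(self.W_cmd @ self.h)                     # ML: locomotion command
        a_t = np.tanh(self.W_ll @ np.concatenate([p_t, command])) # LL: action
        return a_t, command


# Distributed control: each agent runs its own controller on its own observations.
agents = [HierarchicalController(ext_dim=10, prop_dim=27) for _ in range(3)]
obs_rng = np.random.default_rng(1)
actions = [ctrl.step(obs_rng.normal(size=10), obs_rng.normal(size=27))[0]
           for ctrl in agents]
```

Because every controller above is built with the same seed, the agents share identical weights but keep separate recurrent states, which is one plausible reading of "centralized training but distributed control"; whether the paper shares parameters across agents in this way is not stated in the caption.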
