Arm-Constrained Curriculum Learning for Loco-Manipulation of the Wheel-Legged Robot

Zifan Wang; Yufei Jia; Lu Shi; Haoyu Wang; Haizhou Zhao; Xueyang Li; Jinni Zhou; Jun Ma; Guyue Zhou

TL;DR

This paper introduces an arm-constrained curriculum learning architecture to tackle the issues that arise when a manipulator is added to a wheel-legged robot, and develops an arm-constrained reinforcement learning algorithm to keep control performance safe and reliable once the manipulator is equipped.

Abstract

Incorporating a robotic manipulator into a wheel-legged robot enhances its agility and expands its potential for practical applications. However, the potential instability and uncertainties that come with the added arm pose additional control challenges. In this paper, we introduce an arm-constrained curriculum learning architecture to tackle the issues introduced by adding the manipulator. First, we develop an arm-constrained reinforcement learning algorithm to ensure safety and stability in control performance. Second, to address discrepancies in reward settings between the arm and the base, we propose a reward-aware curriculum learning method. The policy is first trained in Isaac Gym and then transferred to the physical robot to perform dynamic grasping tasks, including a door-opening task, a fan-twitching task, and a relay-baton-picking-and-following task. The results demonstrate that the proposed approach effectively controls the arm-equipped wheel-legged robot to master dynamic grasping skills, allowing it to chase and catch a moving object while in motion. Please refer to our website (https://acodedog.github.io/wheel-legged-loco-manipulation) for the code and supplemental videos.
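For readers unfamiliar with constrained reinforcement learning, the sketch below shows one common way a cost constraint can enter a policy-gradient update: a Lagrange multiplier that grows while the measured cost exceeds its limit. This is a generic Lagrangian relaxation, not necessarily the arm-constrained algorithm of the paper, and the names `update_multiplier`, `penalized_advantage`, `cost_limit`, and `lr` are hypothetical.

```python
def update_multiplier(lam, avg_cost, cost_limit, lr=0.05):
    """Dual ascent step: raise lam while the constraint is violated.

    The multiplier is clipped at zero, so a satisfied constraint
    never turns into a bonus. All names here are illustrative.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))


def penalized_advantage(reward_adv, cost_adv, lam):
    """Blend reward and cost advantages for the policy update.

    Dividing by (1 + lam) keeps the magnitude bounded as lam grows.
    """
    return (reward_adv - lam * cost_adv) / (1.0 + lam)


if __name__ == "__main__":
    lam = 0.0
    # Hypothetical per-iteration average arm costs against a limit of 0.5.
    for avg_cost in [1.0, 0.8, 0.4]:
        lam = update_multiplier(lam, avg_cost, cost_limit=0.5)
        print(f"avg_cost={avg_cost:.1f} lam={lam:.3f}")
```

In such a scheme the penalized advantage would replace the plain reward advantage in the clipped policy loss, so that arm-related violations are weighed more heavily the longer they persist.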

Paper Structure (23 sections, 3 equations, 6 figures, 1 table)

Introduction
Related work
RL-based Control of Legged Robots
Loco-Manipulation for Legged Robots
Preliminary
Constrained Markov Decision Process (CMDP)
Constrained Proximal Policy Optimization
Method
Overview of the structure
Arm-Constrained Proximal Policy Optimization
Reward-Aware Curriculum Learning
Two-phase Learning using Behavior Cloning
Experiments
Experimental Setup
Observation Space
...and 8 more sections

Figures (6)

Figure 1: Tasks accomplished by the proposed architecture. Top-Left: door-opening-and-pulling task; Top-Right: fan-knob-twitching task; Bottom-Left: relay-baton-chasing task; Bottom-Right: door-opening-and-pushing task.
Figure 2: The overall illustration of the proposed framework. Top: two-phase learning procedure; Bottom: the detailed representation of the network.
Figure 3: Illustration of the arm-equipped wheel-legged robotic platform.
Figure 4: The tracking results in simulation.
Figure 5: Effect of curriculum learning for rewards and actions.
...and 1 more figures
