Achieving Stable High-Speed Locomotion for Humanoid Robots with Deep Reinforcement Learning

Xinming Zhang; Xianghui Wang; Lerong Zhang; Guodong Guo; Xiaoyu Shen; Wei Zhang

TL;DR

The paper proposes Kinodynamic-constrained Stable Locomotion Control (KSLC), a method that integrates deep reinforcement learning with kinodynamic priors. It enables humanoid robots to track velocity commands accurately, with significantly smaller gait fluctuations than the baseline.

Abstract

Humanoid robots offer significant versatility for performing a wide range of tasks, yet their basic ability to walk and run, especially at high velocities, remains a challenge. This letter presents a novel method that combines deep reinforcement learning with kinodynamic priors to achieve stable locomotion control (KSLC). KSLC promotes coordinated arm movements to counteract destabilizing forces, enhancing overall stability. Compared to the baseline method, KSLC provides more accurate tracking of commanded velocities and better generalization in velocity control. In simulation tests, the KSLC-enabled humanoid robot successfully tracked a target velocity of 3.5 m/s with reduced fluctuations. Sim-to-sim validation in a high-fidelity environment further confirmed its robust performance, highlighting its potential for real-world applications.

Paper Structure (22 sections, 10 equations, 6 figures, 2 tables)

Introduction
Related Work
DRL-based Humanoid Locomotion Control
Angular Momentum-based Humanoid Control
Preliminaries
Problem Formulation
Kinodynamics Priors
Methods
Angular Momentum-based Reward
Velocity-related Reward
Base Height
Feet Clearance
Joint Position
Curriculum Learning Strategy
Velocity Curriculum
...and 7 more sections

Figures (6)

Figure 1: (a) Training environment: Humanoid-Gym [gu2024humanoid]. (b) Actor-Critic network structure.
Figure 2: Humanoid robot XBot-L and its Degrees-of-Freedom (DoF) configuration [gu2024humanoid].
Figure 3: Angular momentum of the arms and legs in two typical locomotion states. Yellow and blue arrows indicate the direction of the angular momentum generated by the arms and legs, respectively.
Figure 4: Learning curves of the humanoid robot XBot-L trained with KSLC and with the baseline method at different commanded velocities.
Figure 5: Performance of the humanoid robot XBot-L across various metrics when tracking different commanded velocities, using policies trained with KSLC and with the baseline method in Humanoid-Gym [gu2024humanoid].
...and 1 more figure
