A Unified and General Humanoid Whole-Body Controller for Versatile Locomotion

Yufei Xue, Wentao Dong, Minghuan Liu, Weinan Zhang, Jiangmiao Pang

TL;DR

The paper addresses the challenge of achieving versatile, controllable locomotion for humanoid robots by introducing HugWBC, a unified whole-body controller. It defines a general command space that couples task goals with behavior-aware gait parameters, and trains a single policy (with hopping as an exception) via asymmetric actor-critic reinforcement learning, enhanced with a symmetry loss and curriculum-based intervention training. Key contributions include the extended command-space design, a comprehensive reward structure with periodic gait cues and foot trajectory planning, and demonstrable sim-to-real transfer on a Unitree H1 across walking, standing, jumping, and hopping, including real-time upper-body interventions. The results show improved tracking accuracy, robustness to upper-body interventions, and insightful analysis of how command combinations influence gait, underscoring HugWBC's potential as a general low-level controller for versatile loco-manipulation tasks in real-world robotics.

Abstract

Locomotion is a fundamental skill for humanoid robots. However, most existing works make locomotion a single, tedious, unextendable, and unconstrained movement. This limits the kinematic capabilities of humanoid robots. In contrast, humans possess versatile athletic abilities: running, jumping, hopping, and finely adjusting gait parameters such as frequency and foot height. In this paper, we investigate solutions to bring such versatility into humanoid locomotion and thereby propose HugWBC: a unified and general humanoid whole-body controller for versatile locomotion. By designing a general command space in terms of tasks and behaviors, along with advanced techniques like symmetrical loss and intervention training for learning a whole-body humanoid control policy in simulation, HugWBC enables real-world humanoid robots to produce various natural gaits, including walking, jumping, standing, and hopping, with customizable parameters such as frequency and foot swing height, further combined with different body height, waist rotation, and body pitch. Beyond locomotion, HugWBC also supports real-time interventions from external upper-body controllers like teleoperation, enabling loco-manipulation with precision under any locomotive behavior. Extensive experiments validate the high tracking accuracy and robustness of HugWBC with and without upper-body intervention for all commands, and we further provide an in-depth analysis of how the various commands affect humanoid movement and offer insights into the relationships between these commands. To our knowledge, HugWBC is the first humanoid whole-body controller that supports such versatile locomotion behaviors with high robustness and flexibility.

Paper Structure

This paper contains 35 sections, 23 equations, 10 figures, 11 tables.

Figures (10)

  • Figure 1: Humanoid capabilities supported by HugWBC. First row: HugWBC allows four standard gaits - walking, jumping, standing, and hopping - with multiple customizable parameters to adjust the foot and pose behaviors, using one policy for 3 of the 4 gaits. Second row: HugWBC supports real-time interventions from external upper-body controllers, enabling loco-manipulation while maintaining precise control under any locomotive behavior. Third row: Various command combinations enable the robot to perform highly dynamic movements.
  • Figure 2: Framework of HugWBC. Illustration with the Unitree H1 robot. a): Visualization of parts of commands. The side view (left) highlights the linear velocity, foot swing height, and body pitch commands. The top-right view shows the angular velocity and waist yaw commands, and the bottom-right view shows the body height command. b): Policy inputs/outputs. The policy takes commands, proprioceptive observations, and the intervention indicator as input, and outputs actions for all joints of the robot. c): Illustrations of four gaits on the robot without/with external intervention. By default, the policy controls both the upper-body and the lower-body joints. d): External control support. Feasible external control signals can be seamlessly integrated into the robot's behavior without hurting locomotion performance.
  • Figure 3: The expected contact probability function $C(\phi_{t,i})$ in the loose and normal formulation. The larger $C(\phi_{t,i})$, the higher the expectation of contact with the ground. The CDF of the normal distribution is introduced into the normal contact probability function to relax the constraint of the foot contact at the switching boundary, resulting in a smooth transition between the swing and the stance phase.
  • Figure 4: Phase variables and clock functions under different gaits. Left: The purple ring represents the phase variable $\phi_1$ for the left foot, while the green ring represents the phase variable $\phi_2$ for the right foot. $\psi$ is the phase offset from $\phi_1$ to $\phi_2$. The dividing phase between stance (marked in blue) and swing (marked in yellow) is the duty cycle $\phi_{\text{stance}} (0.5)$. Right: The purple line depicts the clock function $Cl_L(t)$ for the left foot over a cycle, while the green line represents the clock function $Cl_R (t)$ for the right foot over a cycle.
  • Figure 5: Intervention noise curriculum. We illustrate sampled noise by visualizing the hand positions relative to the visualized robot hand joints. Top 1-3: Noise samples of three curriculum stages with noise levels ranging from small to large. These noises are only relative to the robot hand joints as visualized in the figures. Bottom 4: Front and top views of the noise samples from the final noise curriculum.
  • ...and 5 more figures
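
The captions of Figures 3 and 4 describe per-foot phase variables and an expected contact probability that is smoothed with the CDF of a normal distribution. The following is a minimal Python sketch of that idea, not the paper's implementation. The function names, the smoothing width `sigma`, the convention that stance occupies the phase interval [0, phi_stance), and the relation phi_2 = (phi_1 + psi) mod 1 are all assumptions made for illustration.

```python
import math


def normal_cdf(x, sigma):
    """CDF of a zero-mean normal distribution with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def foot_phases(t, frequency, psi):
    """Hypothetical helper: phases in [0, 1) for the left and right foot.

    Assumes the right-foot phase is the left-foot phase shifted by psi.
    """
    phi_1 = (frequency * t) % 1.0
    phi_2 = (phi_1 + psi) % 1.0
    return phi_1, phi_2


def hard_contact(phi, phi_stance=0.5):
    """Step-shaped contact expectation: 1 during stance, 0 during swing."""
    return 1.0 if (phi % 1.0) < phi_stance else 0.0


def smooth_contact(phi, phi_stance=0.5, sigma=0.02):
    """Contact expectation with both stance boundaries blurred by a normal CDF.

    Each stance window [start, start + phi_stance) is approximated by
    CDF(phi - start) * (1 - CDF(phi - end)). Neighbouring periodic copies
    are summed so the wrap-around at phi = 0 is smooth as well.
    """
    phi = phi % 1.0
    total = 0.0
    for shift in (-1.0, 0.0, 1.0):
        start = shift
        end = shift + phi_stance
        total += normal_cdf(phi - start, sigma) * (1.0 - normal_cdf(phi - end, sigma))
    return total


if __name__ == "__main__":
    # Print the hard and smoothed expectations around the stance/swing boundary.
    for phi in (0.25, 0.48, 0.50, 0.52, 0.75):
        print(f"phi={phi:.2f} hard={hard_contact(phi):.0f} smooth={smooth_contact(phi):.3f}")
```

In this sketch the smoothed value stays close to the step function in the middle of stance and swing and passes through 0.5 at each boundary, which is one way to relax the contact constraint near the switching phase. A larger `sigma` widens the transition region.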