Hierarchical Learning Framework for Whole-Body Model Predictive Control of a Real Humanoid Robot

Koji Ishihara, Hiroaki Gomi, Jun Morimoto

TL;DR

The paper tackles two obstacles to using whole-body MPC on real humanoid robots: the simulation-to-real gap and the high computational burden. It introduces a biologically-inspired three-layer hierarchy in which the upper layer learns an accurate, uncertainty-aware dynamics model via model-based RL, using a deep residual network to augment the analytical model, while the middle and lower layers provide fast long- and short-latency control via iterative learning and Dynamic Movement Primitives. Across 10 motion-learning scenarios on a real robot, the framework shows that whole-body MPC can be used effectively despite low-frequency policy updates, with ablations confirming the necessity of each layer and of the augmented dynamics. The approach advances real-time, multi-contact motion generation on humanoids and offers a path toward broader motion repertoires and robustness in dynamic environments.

Abstract

The simulation-to-real gap problem and the high computational burden of whole-body Model Predictive Control (whole-body MPC) continue to present challenges in generating a wide variety of movements using whole-body MPC for real humanoid robots. This paper presents a biologically-inspired hierarchical learning framework as a potential solution to the aforementioned problems. The proposed three-layer hierarchical framework enables the generation of multi-contact, dynamic behaviours even with low-frequency policy updates of whole-body MPC. The upper layer is responsible for learning an accurate dynamics model with the objective of reducing the discrepancy between the analytical model and the real system. This enables the computation of effective control policies using whole-body MPC. Subsequently, the middle and lower layers are tasked with learning additional policies to generate high-frequency control inputs. In order to learn an accurate dynamics model in the upper layer, an augmented model using a deep residual network is trained by model-based reinforcement learning with stochastic whole-body MPC. The proposed framework was evaluated in 10 distinct motion learning scenarios, including jogging on a flat surface and skating on curved surfaces. The results demonstrate that a wide variety of motions can be successfully generated on a real humanoid robot using whole-body MPC through learning with the proposed framework.
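
The idea of augmenting an analytical model with a learned residual can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy point-mass system in place of the humanoid, substitutes a linear ridge-regression residual for the deep residual network, and omits stochastic whole-body MPC entirely. All names (`analytical_step`, `real_step`, `fit_residual`, `augmented_step`) and parameter values are hypothetical.

```python
import numpy as np

def analytical_step(x, u, dt=0.01):
    """Nominal (analytical) model: unit point mass, state x = [position, velocity]."""
    pos, vel = x
    return np.array([pos + dt * vel, vel + dt * u])

def real_step(x, u, dt=0.01, damping=0.8):
    """Stand-in for the real system: same mass plus unmodelled viscous damping (assumed)."""
    pos, vel = x
    return np.array([pos + dt * vel, vel + dt * (u - damping * vel)])

def features(x, u):
    """Inputs to the residual model. Linear here; the paper uses a deep residual network."""
    return np.array([x[0], x[1], u, 1.0])

def fit_residual(X, U, X_next, ridge=1e-8):
    """Fit W so that features(x, u) @ W approximates (observed next state - analytical prediction)."""
    Phi = np.stack([features(x, u) for x, u in zip(X, U)])
    R = np.stack([xn - analytical_step(x, u) for x, u, xn in zip(X, U, X_next)])
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ R)

def augmented_step(x, u, W):
    """Augmented model: analytical prediction plus learned residual correction."""
    return analytical_step(x, u) + features(x, u) @ W

# Collect transitions from the "real" system and fit the residual.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
U = rng.uniform(-1.0, 1.0, size=200)
X_next = np.stack([real_step(x, u) for x, u in zip(X, U)])
W = fit_residual(X, U, X_next)

# Compare one-step prediction errors on a held-out state-input pair.
x_test, u_test = np.array([0.3, -0.5]), 0.2
target = real_step(x_test, u_test)
err_nominal = np.linalg.norm(target - analytical_step(x_test, u_test))
err_augmented = np.linalg.norm(target - augmented_step(x_test, u_test, W))
print(f"nominal error: {err_nominal:.2e}, augmented error: {err_augmented:.2e}")
```

In this toy setting the residual is exactly linear in the features, so the augmented model recovers the damping term almost perfectly. On a real robot the discrepancy is not expected to have such a simple form, which is presumably why the paper relies on a deep network.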

Paper Structure

This paper contains 20 sections, 23 equations, 18 figures, and 1 table.

Figures (18)

  • Figure 1: Overview of the proposed three-layer hierarchical learning framework. In the upper layer, an accurate dynamics model is learned and the control policy is computed by stochastic whole-body MPC. In the middle and lower layers, additional control policies are learned as long- and short-latency responses. The policies from each layer are sent to the low-level controller to generate the robot's motion.
  • Figure 2: Jogging generated in the early learning phase
  • Figure 3: Jogging generated in the late learning phase
  • Figure 5: One-leg standing generated in the late learning phase
  • Figure 6: Sidestepping generated in the late learning phase
  • ...and 13 more figures