WoCoCo: Learning Whole-Body Humanoid Control with Sequential Contacts
Chong Zhang, Wenli Xiao, Tairan He, Guanya Shi
TL;DR
WoCoCo introduces a unified RL framework that learns whole-body humanoid control under sequential contact plans by decomposing tasks into contact stages and using a compact, task-agnostic reward design (dense contact, stage progression, and curiosity). A sim-to-real curriculum with domain randomization and regularization enables end-to-end policies to transfer from simulation to real humanoids, demonstrated on four dynamic tasks and on loco-manipulation with a 22-DoF dinosaur robot. The approach reduces task-specific tuning and avoids heavy motion priors while maintaining robustness to perturbations and unseen environments. Collectively, these results show that sequential-contact RL applies broadly to versatile, real-world locomotion and manipulation across embodiments.
Abstract
Humanoid activities involving sequential contacts are crucial for complex robotic interactions and operations in the real world and are traditionally solved by model-based motion planning, which is time-consuming and often relies on simplified dynamics models. Although model-free reinforcement learning (RL) has become a powerful tool for versatile and robust whole-body humanoid control, it still requires tedious task-specific tuning and state machine design and suffers from long-horizon exploration issues in tasks involving contact sequences. In this work, we propose WoCoCo (Whole-Body Control with Sequential Contacts), a unified framework to learn whole-body humanoid control with sequential contacts by naturally decomposing the tasks into separate contact stages. Such decomposition facilitates simple and general policy learning pipelines through task-agnostic reward and sim-to-real designs, requiring only one or two task-related terms to be specified for each task. We demonstrate that end-to-end RL-based controllers trained with WoCoCo enable four challenging whole-body humanoid tasks involving diverse contact sequences in the real world without any motion priors: 1) versatile parkour jumping, 2) box loco-manipulation, 3) dynamic clap-and-tap dancing, and 4) cliffside climbing. We further show that WoCoCo is a general framework beyond humanoids by applying it to 22-DoF dinosaur robot loco-manipulation tasks.
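To make the idea of contact-stage decomposition concrete, the following is a minimal illustrative sketch of how a staged reward might combine the three task-agnostic terms named in the TL;DR (dense contact, stage progression, curiosity). It is not the paper's formulation: the class name `StagedContactReward`, the parameters `stage_bonus` and `curiosity_scale`, the count-based curiosity bonus, and the `state_key` discretization are all assumptions chosen for illustration.

```python
from collections import defaultdict
from math import sqrt


class StagedContactReward:
    """Hypothetical staged reward over a sequential contact plan (illustrative only)."""

    def __init__(self, plan, stage_bonus=1.0, curiosity_scale=0.1):
        # plan: list of stages; each stage is a tuple of desired contact
        # flags, one per end-effector (e.g. left foot, right foot).
        self.plan = plan
        self.stage_bonus = stage_bonus          # assumed bonus for completing a stage
        self.curiosity_scale = curiosity_scale  # assumed weight of the curiosity term
        self.stage = 0
        self.visits = defaultdict(int)          # (stage, state_key) -> visit count

    def step(self, contacts, state_key):
        """Return the reward for the current contact flags and a discretized state."""
        if self.stage >= len(self.plan):
            return 0.0  # plan already completed

        desired = self.plan[self.stage]

        # Dense contact term: fraction of end-effectors in the desired contact state.
        matches = sum(c == d for c, d in zip(contacts, desired))
        reward = matches / len(desired)

        # Curiosity term (assumed count-based): decays with repeated visits.
        key = (self.stage, state_key)
        self.visits[key] += 1
        reward += self.curiosity_scale / sqrt(self.visits[key])

        # Stage progression term: bonus and advance once the stage is fulfilled.
        if matches == len(desired):
            reward += self.stage_bonus
            self.stage += 1
        return reward
```

In this sketch, a two-stage plan such as `[(True, False), (False, True)]` rewards partial matches densely, pays the stage bonus once both flags match, and then moves on to the next stage. Only the contact plan changes from task to task, which mirrors the abstract's claim that few task-related terms need to be specified.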
