Humanoid Hanoi: Investigating Shared Whole-Body Control for Skill-Based Box Rearrangement
Minku Kim, Kuan-Chia Chen, Aayam Shrestha, Li Fuxin, Stefan Lee, Alan Fern
TL;DR
This work tackles the challenge of long-horizon humanoid box rearrangement by orchestrating reusable loco-manipulation skills through a single shared whole-body controller (WBC). It demonstrates that naive reuse of a pretrained WBC can degrade robustness, and proposes rollout-based data aggregation to expand the WBC's coverage without altering the high-level skill interfaces. The Humanoid Hanoi benchmark is introduced to quantify long-horizon performance, with simulations and hardware experiments (Digit V3) showing improved stability and success over non-shared baselines. The findings support a scalable, task-agnostic control framework for humanoids, highlighting practical gains in robustness and offering concrete directions for future improvements in perception, placement stabilization, and real-world reliability.
Abstract
We investigate a skill-based framework for humanoid box rearrangement that enables long-horizon execution by sequencing reusable skills at the task level. In our architecture, all skills execute through a shared, task-agnostic whole-body controller (WBC), providing a consistent closed-loop interface for skill composition, in contrast to non-shared designs that use separate low-level controllers per skill. We find that naively reusing the same pretrained WBC can reduce robustness over long horizons, as new skills and their compositions induce shifted state and command distributions. We address this with a simple data aggregation procedure that augments shared-WBC training with rollouts from closed-loop skill execution under domain randomization. To evaluate the approach, we introduce Humanoid Hanoi, a long-horizon Tower-of-Hanoi box rearrangement benchmark, and report results in simulation and on the Digit V3 humanoid robot, demonstrating fully autonomous rearrangement over extended horizons and quantifying the benefits of the shared-WBC approach over non-shared baselines.
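To give a feel for why a Tower-of-Hanoi task is long-horizon, the sketch below generates the standard recursive Hanoi move sequence and expands each move into a chain of skill invocations. It is an illustration only: the stack labels (`A`, `B`, `C`) and the skill names (`walk_to`, `pick`, `place`) are hypothetical placeholders, not the skill set or benchmark layout used in this work.

```python
from typing import List, Tuple

Move = Tuple[int, str, str]   # (box index, source stack, target stack); box 1 is the top box
SkillCall = Tuple[str, object]  # (hypothetical skill name, argument)


def hanoi_moves(n: int, src: str = "A", dst: str = "C", aux: str = "B") -> List[Move]:
    """Return the classic Tower-of-Hanoi move list for n boxes (2**n - 1 moves)."""
    if n == 0:
        return []
    # Move the n-1 boxes above box n out of the way, move box n, then restack.
    return (
        hanoi_moves(n - 1, src, aux, dst)
        + [(n, src, dst)]
        + hanoi_moves(n - 1, aux, dst, src)
    )


def expand_to_skills(moves: List[Move]) -> List[SkillCall]:
    """Expand each box move into a fixed chain of (assumed) skill invocations."""
    plan: List[SkillCall] = []
    for box, src, dst in moves:
        plan += [
            ("walk_to", src),  # hypothetical locomotion skill
            ("pick", box),     # hypothetical pick-up skill
            ("walk_to", dst),
            ("place", box),    # hypothetical placement skill
        ]
    return plan


if __name__ == "__main__":
    moves = hanoi_moves(3)
    plan = expand_to_skills(moves)
    print(len(moves), "box moves,", len(plan), "skill invocations")
```

Under these assumptions, three boxes already require 7 box moves and 28 consecutive skill invocations, so small per-skill failure rates compound quickly over the full sequence.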
