Hierarchical Planning and Control for Box Loco-Manipulation
Zhaoming Xie, Jonathan Tseng, Sebastian Starke, Michiel van de Panne, C. Karen Liu
TL;DR
This paper enables a physics-based humanoid to rearrange boxes in cluttered environments by integrating locomotion and manipulation within a three-level hierarchical stack. A kinodynamic planner provides waypoint constraints, a diffusion-based mid-level motion generator produces realistic whole-body trajectories, and imitation-based RL policies execute the low-level motor skills, including an object-aware manipulation policy. The key contributions are a hierarchical planning-and-control framework, the use of diffusion models with bidirectional root control for robust locomotion planning, and demonstrated generalization of a single pick-up/put-down motion to objects of varying weights and heights. The result is a robust loco-manipulation capability relevant to virtual humans and robotics; the authors also outline concrete avenues for improvement, such as unified models and dynamic replanning.
Abstract
Humans perform everyday tasks using a combination of locomotion and manipulation skills. Building a system that can handle both skills is essential to creating virtual humans. We present a physically-simulated human capable of solving box rearrangement tasks, which require a combination of both skills. We propose a hierarchical control architecture, where each level solves the task at a different level of abstraction, and the result is a physics-based simulated virtual human capable of rearranging boxes in a cluttered environment. The control architecture integrates a planner, diffusion models, and physics-based motion imitation of sparse motion clips using deep reinforcement learning. Boxes can vary in size, weight, shape, and placement height. Code and trained control policies are provided.
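To make the layering concrete, the following is a minimal sketch of how a planner, a motion generator, and a low-level controller might hand data to one another. It is not the authors' implementation: the function names (plan_waypoints, generate_root_trajectory, track) are hypothetical, breadth-first search stands in for the planner, linear interpolation stands in for the diffusion model, and a proportional tracker stands in for the RL imitation policies.

```python
from collections import deque
from typing import List, Tuple

Cell = Tuple[int, int]
Point = Tuple[float, float]


def plan_waypoints(grid: List[str], start: Cell, goal: Cell) -> List[Cell]:
    """High level (stand-in): breadth-first search on a grid, '#' marks clutter."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk parents back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return []  # no collision-free path found


def generate_root_trajectory(waypoints: List[Cell], steps_per_segment: int = 4) -> List[Point]:
    """Mid level (stand-in): densify waypoints by linear interpolation.
    The paper uses a diffusion model here; this only mimics the interface."""
    traj: List[Point] = []
    for (r0, c0), (r1, c1) in zip(waypoints, waypoints[1:]):
        for k in range(steps_per_segment):
            t = k / steps_per_segment
            traj.append((r0 + t * (r1 - r0), c0 + t * (c1 - c0)))
    if waypoints:
        traj.append((float(waypoints[-1][0]), float(waypoints[-1][1])))
    return traj


def track(trajectory: List[Point], gain: float = 0.5) -> List[Point]:
    """Low level (stand-in): proportional tracker toward each target point.
    The paper uses RL imitation policies in a physics simulator instead."""
    if not trajectory:
        return []
    pos = trajectory[0]
    visited = [pos]
    for target in trajectory[1:]:
        pos = (pos[0] + gain * (target[0] - pos[0]),
               pos[1] + gain * (target[1] - pos[1]))
        visited.append(pos)
    return visited


grid = ["..#",
        "..#",
        "..."]
waypoints = plan_waypoints(grid, (0, 0), (2, 2))
trajectory = generate_root_trajectory(waypoints)
executed = track(trajectory)
```

Each level consumes only the output of the level above it, which is the property the hierarchical design relies on: any one stand-in could be swapped for a learned component without changing the other two.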
