DreamControl: Human-Inspired Whole-Body Humanoid Control for Scene Interaction via Guided Diffusion
Dvij Kalaria, Sudarshan S Harithas, Pushkal Katara, Sangkyung Kwak, Sarthak Bhagat, Shankar Sastry, Srinath Sridhar, Sai Vemprala, Ashish Kapoor, Jonathan Chung-Kuan Huang
TL;DR
DreamControl addresses the challenge of learning autonomous whole-body humanoid skills by combining a diffusion-based human motion prior with reinforcement learning. A diffusion prior guided by text and spatiotemporal cues generates reference trajectories, which are retargeted to a Unitree G1 and used to train a goal-conditioned RL policy that executes tasks in simulation and transfers to real hardware. The approach yields more natural, stable, and task-consistent motions than baselines, with strong sim-to-real performance and broad task coverage. By reducing reliance on teleoperation data and leveraging abundant human motion data, DreamControl offers a data-efficient route to scalable, scene-interacting humanoid control across diverse morphologies and tasks.
Abstract
We introduce DreamControl, a novel methodology for learning autonomous whole-body humanoid skills. DreamControl leverages the strengths of diffusion models and reinforcement learning (RL): our core innovation is the use of a diffusion prior trained on human motion data, which subsequently guides an RL policy in simulation to complete specific tasks of interest (e.g., opening a drawer or picking up an object). We demonstrate that this human motion-informed prior allows RL to discover solutions unattainable by direct RL, and that diffusion models inherently promote natural-looking motions, aiding sim-to-real transfer. We validate DreamControl's effectiveness on a Unitree G1 robot across a diverse set of challenging tasks involving simultaneous lower- and upper-body control and object interaction. Project website: https://genrobo.github.io/DreamControl/
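One common way a motion prior can guide an RL policy is to add a reference-tracking term to the task reward. The sketch below illustrates that general idea only; it is not taken from the paper. The function names, the Gaussian-style tracking kernel, and the weights `w_track`, `w_task`, and `sigma` are all hypothetical choices made for illustration.

```python
import math


def tracking_reward(q, q_ref, sigma=0.5):
    """Score in (0, 1] for how closely joint positions q follow the reference q_ref.

    Hypothetical kernel: exp(-squared_error / sigma^2). Not the paper's reward.
    """
    if len(q) != len(q_ref):
        raise ValueError("q and q_ref must have the same length")
    sq_err = sum((a - b) ** 2 for a, b in zip(q, q_ref))
    return math.exp(-sq_err / sigma ** 2)


def guided_reward(q, q_ref, task_success, w_track=1.0, w_task=5.0):
    """Weighted sum of the tracking term and a sparse task-completion bonus.

    The weights are illustrative assumptions, not values from the paper.
    """
    return w_track * tracking_reward(q, q_ref) + w_task * float(task_success)
```

In this sketch the tracking term keeps the policy close to the human-like reference at each step, while the sparse bonus rewards actually completing the task, so the policy is free to deviate from the reference where the task requires it.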
