PDP: Physics-Based Character Animation via Diffusion Policy
Takara E. Truong, Michael Piseno, Zhaoming Xie, C. Karen Liu
TL;DR
PDP tackles the problem of robust, diverse physics-based character animation by integrating reinforcement learning with diffusion-based behavior cloning. It trains expert RL policies, collects noisy-state/clean-action data to build a broad, stochastic dataset, and learns a diffusion model conditioned on observations and task cues to handle multi-modality. The method achieves strong performance in perturbation recovery, universal motion tracking, and text-to-motion tasks, demonstrating robustness to both in-distribution and out-of-distribution disturbances and superior handling of language-conditioned motions. While diffusion-based control offers clear benefits for multi-modal behavior, the approach faces slower inference and trade-offs between long-horizon diversity and immediate action robustness, pointing to directions for acceleration and adaptive training strategies.
Abstract
Generating diverse and realistic human motion that can physically interact with an environment remains a challenging research area in character animation. Meanwhile, diffusion-based methods, as proposed by the robotics community, have demonstrated the ability to capture highly diverse and multi-modal skills. However, naively training a diffusion policy often results in unstable motions for high-frequency, under-actuated control tasks like bipedal locomotion, because compounding errors accumulate rapidly and push the agent away from optimal training trajectories. Our key idea is to use RL policies not just to provide optimal trajectories but also to provide corrective actions in sub-optimal states, giving the policy a chance to correct for errors caused by environmental stimuli, model errors, or numerical errors in simulation. Our method, Physics-Based Character Animation via Diffusion Policy (PDP), combines reinforcement learning (RL) and behavior cloning (BC) to create a robust diffusion policy for physics-based character animation. We demonstrate PDP on perturbation recovery, universal motion tracking, and physics-based text-to-motion synthesis.
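
The noisy-state/clean-action idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy point-mass dynamics, the PD controller standing in for a trained RL expert, the names `expert_action`, `step`, and `collect_noisy_state_clean_action`, and the choice to create noisy states by perturbing the executed action. The point it shows is that the rollout drifts into sub-optimal states while every recorded label is still the expert's unperturbed, corrective action.

```python
import numpy as np


def expert_action(state):
    # Hypothetical stand-in for a trained RL expert: a PD controller
    # that drives a 1D point mass back to the origin.
    pos, vel = state
    return -4.0 * pos - 2.0 * vel


def step(state, action, dt=0.05):
    # Toy point-mass dynamics (semi-implicit Euler), assumed for illustration.
    pos, vel = state
    vel = vel + action * dt
    pos = pos + vel * dt
    return np.array([pos, vel])


def collect_noisy_state_clean_action(num_steps, noise_std, rng):
    # Roll out the expert, executing a perturbed action so that the visited
    # states drift off the optimal trajectory, while recording the clean
    # expert action as the behavior-cloning label for each visited state.
    state = np.array([1.0, 0.0])
    states, actions = [], []
    for _ in range(num_steps):
        clean = expert_action(state)
        states.append(state.copy())
        actions.append(clean)  # label: clean (corrective) action
        noisy = clean + rng.normal(0.0, noise_std)
        state = step(state, noisy)  # environment sees the noisy action
    return np.array(states), np.array(actions)


rng = np.random.default_rng(0)
states, actions = collect_noisy_state_clean_action(200, 0.5, rng)
```

A dataset built this way pairs off-trajectory states with the actions that recover from them, which is the kind of supervision a behavior-cloned policy needs to counter compounding errors. In PDP the learner trained on such data is a diffusion model conditioned on observations and task cues; that part is not sketched here.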
