PDP: Physics-Based Character Animation via Diffusion Policy

Takara E. Truong, Michael Piseno, Zhaoming Xie, C. Karen Liu

TL;DR

PDP tackles the problem of robust, diverse physics-based character animation by integrating reinforcement learning with diffusion-based behavior cloning. It trains expert RL policies, collects noisy-state/clean-action data to build a broad, stochastic dataset, and learns a diffusion model conditioned on observations and task cues to handle multi-modality. The method achieves strong performance in perturbation recovery, universal motion tracking, and text-to-motion tasks, demonstrating robustness to both in-distribution and out-of-distribution disturbances and superior handling of language-conditioned motions. While diffusion-based control offers clear benefits for multi-modal behavior, the approach faces slower inference and trade-offs between long-horizon diversity and immediate action robustness, pointing to directions for acceleration and adaptive training strategies.

Abstract

Generating diverse and realistic human motion that can physically interact with an environment remains a challenging research area in character animation. Meanwhile, diffusion-based methods, as proposed by the robotics community, have demonstrated the ability to capture highly diverse and multi-modal skills. However, naively training a diffusion policy often results in unstable motions for high-frequency, under-actuated control tasks like bipedal locomotion due to rapidly accumulating compounding errors, pushing the agent away from optimal training trajectories. The key idea lies in using RL policies not just for providing optimal trajectories but for providing corrective actions in sub-optimal states, giving the policy a chance to correct for errors caused by environmental stimulus, model errors, or numerical errors in simulation. Our method, Physics-Based Character Animation via Diffusion Policy (PDP), combines reinforcement learning (RL) and behavior cloning (BC) to create a robust diffusion policy for physics-based character animation. We demonstrate PDP on perturbation recovery, universal motion tracking, and physics-based text-to-motion synthesis.
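The "corrective actions in sub-optimal states" idea can be illustrated with a small, self-contained sketch. It is not the authors' implementation. The point-mass dynamics, the PD controller standing in for a trained RL expert, the function names, and the noise value are all hypothetical. The sketch also assumes that noise is injected into the executed action so the rollout drifts off the optimal trajectory, which the text above does not specify. What it shows is only the data layout: each visited (possibly sub-optimal) state is stored together with the expert's unperturbed action.

```python
import random

def expert_policy(state):
    """Hypothetical stand-in for a trained RL expert: a PD controller
    that drives a 1D point mass toward the origin."""
    pos, vel = state
    return -4.0 * pos - 2.0 * vel

def step(state, action, dt=0.05):
    """Toy point-mass dynamics (semi-implicit Euler); a placeholder
    for a real physics simulator."""
    pos, vel = state
    vel = vel + action * dt
    pos = pos + vel * dt
    return (pos, vel)

def collect_noisy_state_clean_action(num_steps, noise_std, seed=0):
    """Roll out the expert with noisy executed actions, but label every
    visited state with the expert's clean (unperturbed) action."""
    rng = random.Random(seed)
    state = (1.0, 0.0)
    dataset = []
    for _ in range(num_steps):
        clean_action = expert_policy(state)
        # Store the clean action as the behavior-cloning target.
        dataset.append((state, clean_action))
        # Assumption: the noise perturbs only what the simulator executes,
        # pushing the next state away from the expert's nominal trajectory.
        noisy_action = clean_action + rng.gauss(0.0, noise_std)
        state = step(state, noisy_action)
    return dataset

# noise_std is an arbitrary illustrative value, not taken from the paper.
dataset = collect_noisy_state_clean_action(num_steps=200, noise_std=0.1)
```

A behavior-cloning learner trained on such pairs sees states that a noise-free expert rollout would never visit, each labeled with the action that steers back toward the expert's behavior.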

Paper Structure (34 sections, 3 equations, 3 figures, 4 tables)

Figures (3)

  • Figure 1: PDP Overview. Top: We first train expert RL policies $\pi_{\mathcal{T}_i}$ on tasks $\mathcal{T}_i$, then use them to collect a dataset of noisy-state, clean-action pairs, and finally train a diffusion model on that dataset with BC. Bottom: Our model is a transformer encoder-decoder architecture. Block B is used for text-conditioned applications, while the other applications use Block A. Note that each application is trained separately on its own distilled dataset.
  • Figure 2: PDP rollouts for a 15% body-weight perturbation. The white pelvis marks where the force is applied and the arrows show its direction. Each row demonstrates a unique mode of recovery from the same perturbation.
  • Figure 3: Global left-foot contact positions (in meters) after a $15\%$ body-weight perturbation. +Y and +X align with the character's forward and right directions, respectively. The colored arrows indicate the directions in which the force is applied to the person. The shaded areas represent foot contacts in the training distribution with noise level 0.12. The black X's mark the ground-truth foot contacts of the human participant. All policies were trained on the stochastic dataset with noise level 0.12.