Scalable Motion In-betweening via Diffusion and Physics-Based Character Adaptation
Jia Qin
TL;DR
This work tackles scalable motion in-betweening across characters with varying skeletons by integrating diffusion-based generation with physics-based adaptation. It introduces a two-stage pipeline: Stage 1 uses a character-agnostic diffusion model to synthesize transitions on a canonical skeleton from sparse keyframes; Stage 2 trains a character-specific RL controller to adapt the canonical motion to each character’s morphology and dynamics, enforcing physical plausibility and stylistic realism. The approach enables cross-skeleton generalization without retraining the diffusion model and reduces artifacts such as foot sliding and mesh interpenetration. Experiments on LAFAN1 and stylized characters show that the method produces physically plausible, style-consistent motions under both sparse and long-range constraints, with favorable K-FID and K-Error trade-offs.
Abstract
We propose a two-stage framework for motion in-betweening that combines diffusion-based motion generation with physics-based character adaptation. In Stage 1, a character-agnostic diffusion model synthesizes transitions from sparse keyframes on a canonical skeleton, allowing the same model to generalize across diverse characters. In Stage 2, a reinforcement learning-based controller, trained per character, adapts the canonical motion to the target character's morphology and dynamics, correcting artifacts such as foot sliding and mesh interpenetration and enhancing stylistic realism. This design supports scalable motion generation across characters with diverse skeletons without retraining the diffusion model. Experiments on LAFAN1 and stylized characters demonstrate that our method produces physically plausible, style-consistent motions under sparse and long-range constraints.
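The data flow of the two-stage design can be illustrated with a minimal sketch. It is not the authors' implementation: linear interpolation stands in for the Stage 1 diffusion model, and a per-character scale-and-clamp function stands in for the Stage 2 RL controller. All names here (`stage1_inbetween`, `make_stage2_adapter`, `inbetween_for_characters`, the flat `Pose` vector, the `scale` and `floor` parameters) are hypothetical and chosen only for illustration. The point it shows is structural: the canonical motion is computed once and then reused by every character-specific adapter.

```python
from typing import Callable, Dict, List

Pose = List[float]    # hypothetical flat pose vector on the canonical skeleton
Motion = List[Pose]   # one pose per frame


def stage1_inbetween(keyframes: Dict[int, Pose], num_frames: int) -> Motion:
    """Stage 1 stand-in: fill frames between sparse keyframes.

    ASSUMPTION: linear interpolation replaces the character-agnostic
    diffusion model described in the text.
    """
    idx = sorted(keyframes)
    if idx[0] != 0 or idx[-1] != num_frames - 1:
        raise ValueError("this sketch expects keyframes at the first and last frame")
    motion: Motion = []
    for a, b in zip(idx, idx[1:]):
        for t in range(a, b):
            w = (t - a) / (b - a)  # blend weight between keyframes a and b
            motion.append([(1 - w) * x + w * y
                           for x, y in zip(keyframes[a], keyframes[b])])
    motion.append(list(keyframes[idx[-1]]))  # final keyframe is kept exactly
    return motion


def make_stage2_adapter(scale: float, floor: float = 0.0) -> Callable[[Motion], Motion]:
    """Stage 2 stand-in: build a character-specific adapter.

    ASSUMPTION: scaling plus clamping to a floor value replaces the
    physics-based RL controller described in the text.
    """
    def adapt(motion: Motion) -> Motion:
        return [[max(floor, scale * v) for v in pose] for pose in motion]
    return adapt


def inbetween_for_characters(
    keyframes: Dict[int, Pose],
    num_frames: int,
    adapters: Dict[str, Callable[[Motion], Motion]],
) -> Dict[str, Motion]:
    """Run Stage 1 once, then apply each character's Stage 2 adapter."""
    canonical = stage1_inbetween(keyframes, num_frames)
    return {name: adapt(canonical) for name, adapt in adapters.items()}


if __name__ == "__main__":
    keys = {0: [0.0, 1.0], 4: [4.0, 1.0]}
    adapters = {"tall": make_stage2_adapter(2.0), "short": make_stage2_adapter(0.5)}
    for name, motion in inbetween_for_characters(keys, 5, adapters).items():
        print(name, motion)
```

In this sketch, adding a new character means adding one entry to `adapters`; `stage1_inbetween` is untouched, mirroring the claim that new skeletons do not require retraining the Stage 1 model.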
