PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation
Hongsong Wang, Yin Zhu, Qiuxia Lai, Yang Zhang, Guo-Sen Xie, Xin Geng
TL;DR
PAMD tackles long-form music-to-dance generation with physical plausibility by embedding three physics-aware modules into a diffusion model: Plausible Motion Constraint (PMC), which uses Neural Distance Fields to constrain poses to a plausible manifold; Prior Motion Guidance (PMG), which uses a standing pose as a lightweight auxiliary condition; and Motion Refinement with Foot-ground Contact (MRFC), which reduces foot-skating artifacts. Conditioned on music features and a fixed standing-pose prior, PAMD denoises sequences from timestep $t=T$ to $t=0$, uses losses including ${\mathcal{L}}_{\text{recon}}$, ${\mathcal{L}}_{\text{joint}}$, ${\mathcal{L}}_{\text{vel}}$, ${\mathcal{L}}_{\text{foot}}$, and ${\mathcal{L}}_{\text{PMC}}$, and applies classifier-free guidance to amplify the conditioning. The approach enables parallel long-dance generation and achieves superior Beat Alignment Score (BAS), physical realism (PFC), and geometry-based quality and diversity (FID$_g$, Div$_g$) on the AIST++ dataset, with user studies indicating a strong perceptual preference for its results. Ablation experiments validate the complementary roles of PMC, PMG, and MRFC, showing notable improvements when all three components are combined. This work advances practical, music-driven human motion generation by enforcing explicit physical plausibility and efficient long-horizon generation, with potential applications in automatic dance creation and editing.
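To make the classifier-free guidance step concrete, here is a minimal sketch of one common formulation, in which the guided prediction is the unconditional prediction pushed toward the conditional one by a guidance weight. The function name `classifier_free_guidance`, the flat-list representation, and the weight values are assumptions for illustration; the summary does not specify PAMD's exact formulation or settings.

```python
def classifier_free_guidance(cond_pred, uncond_pred, guidance_weight):
    """Blend conditional and unconditional model predictions element-wise.

    Assumed (hypothetical) form: guided = uncond + w * (cond - uncond).
    w = 0 returns the unconditional prediction, w = 1 the conditional one,
    and w > 1 amplifies the effect of the conditioning signal.
    """
    if len(cond_pred) != len(uncond_pred):
        raise ValueError("predictions must have the same length")
    return [
        u + guidance_weight * (c - u)
        for c, u in zip(cond_pred, uncond_pred)
    ]


# Toy values standing in for one flattened pose prediction.
cond = [1.0, 2.0]     # prediction given music (and prior pose) conditions
uncond = [0.0, 0.0]   # prediction with the conditions dropped
guided = classifier_free_guidance(cond, uncond, guidance_weight=2.0)
```

With a weight above 1, the guided output moves past the conditional prediction, which is how guidance amplifies the conditioning.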
Abstract
Computational dance generation is crucial in many areas, such as art, human-computer interaction, virtual reality, and digital entertainment, particularly for generating coherent and expressive long dance sequences. Diffusion-based music-to-dance generation has made significant progress, yet existing methods still struggle to produce physically plausible motions. To address this, we propose Plausibility-Aware Motion Diffusion (PAMD), a framework for generating dances that are both musically aligned and physically realistic. The core of PAMD lies in the Plausible Motion Constraint (PMC), which leverages Neural Distance Fields (NDFs) to model the manifold of actual poses and guide generated motions toward this physically valid manifold. To provide more effective guidance during generation, we incorporate Prior Motion Guidance (PMG), which uses standing poses as auxiliary conditions alongside music features. To further enhance realism for complex movements, we introduce the Motion Refinement with Foot-ground Contact (MRFC) module, which addresses foot-skating artifacts by bridging the gap between the optimization objective in linear joint position space and the data representation in nonlinear rotation space. Extensive experiments show that PAMD significantly improves musical alignment and enhances the physical plausibility of generated motions. The project page is available at: https://mucunzhuzhu.github.io/PAMD-page/.
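As an illustration of what a foot-skating objective in joint position space can look like, the sketch below penalizes foot displacement between consecutive frames whenever the foot is labeled as touching the ground. This is a generic contact-consistency penalty and not the MRFC module itself; the function name `foot_skate_penalty`, the input layout, and the binary contact labels are assumptions.

```python
def foot_skate_penalty(foot_positions, contact_labels):
    """Mean squared foot displacement over frames where the foot is in contact.

    foot_positions: list of (x, y, z) tuples, one per frame (hypothetical layout).
    contact_labels: list of 0/1 flags, 1 meaning the foot touches the ground.
    A foot that moves while labeled as in contact is "skating" and is penalized.
    """
    if len(foot_positions) != len(contact_labels):
        raise ValueError("need one contact label per frame")
    num_frames = len(foot_positions)
    if num_frames < 2:
        return 0.0
    total = 0.0
    for i in range(num_frames - 1):
        if contact_labels[i]:
            current, following = foot_positions[i], foot_positions[i + 1]
            # Squared Euclidean displacement between consecutive frames.
            total += sum((b - a) ** 2 for a, b in zip(current, following))
    return total / (num_frames - 1)


# The foot slides 1 unit along x between frames 0 and 1 while in contact.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
sliding = foot_skate_penalty(positions, [1, 1, 1])
airborne = foot_skate_penalty(positions, [0, 1, 1])
```

The penalty is defined on joint positions, while motion data are often stored as joint rotations that must pass through forward kinematics first; this is the position-space versus rotation-space gap the abstract refers to.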
