PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
Yingjie Xi, Jian Jun Zhang, Xiaosong Yang
TL;DR
The paper tackles the challenge of generating controllable, high-fidelity human motion that adheres to both global trajectories and fine-grained postures. It introduces ProMoGen, a diffusion-based framework that decouples trajectory guidance from local anchor poses via a Trajectory Encoder and an Anchor Motion Encoder, coupled through an Initial Motion Generator and a Refinement Module. To stabilize learning with sparse anchor guidance, it proposes SAP-CL, a curriculum that progressively reduces anchor density across stages and employs a Filtering Module to sample anchors. Experiments on HumanML3D and CombatMotion demonstrate state-of-the-art performance across metrics such as MPJPE and FID, confirming improved controllability, fidelity, and efficiency over baseline methods.
Abstract
In computer animation, game design, and human-computer interaction, synthesizing human motion that aligns with user intent remains a significant challenge. Existing methods have notable limitations: text-based approaches offer high-level semantic guidance but struggle to describe complex actions accurately; trajectory-based techniques provide intuitive global motion direction yet often fall short in generating precise or customized character movements; and anchor-pose-guided methods are typically confined to synthesizing simple motion patterns. To generate more controllable and precise human motions, we propose \textbf{ProMoGen (Progressive Motion Generation)}, a novel framework that integrates trajectory guidance with sparse anchor motion control. Global trajectories ensure consistency in spatial direction and displacement, while sparse anchor motions deliver precise action guidance only, without displacement. This decoupling enables independent refinement of both aspects, resulting in more controllable, high-fidelity, and sophisticated motion synthesis. ProMoGen supports both dual and single control paradigms within a unified training process. Moreover, because direct learning from sparse motions is inherently unstable, we introduce \textbf{SAP-CL (Sparse Anchor Posture Curriculum Learning)}, a curriculum learning strategy that progressively adjusts the number of anchors used for guidance, thereby enabling more precise and stable convergence. Extensive experiments demonstrate that ProMoGen excels in synthesizing vivid and diverse motions guided by a predefined trajectory and arbitrary anchor frames. Our approach seamlessly integrates personalized motion with structured guidance, significantly outperforming state-of-the-art methods across multiple control scenarios.
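The curriculum idea behind SAP-CL can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `anchor_schedule` and `sample_anchor_mask`, the linear decay from dense to sparse anchors, and the uniform random sampling of anchor frames are all assumptions made for illustration. The abstract states only that the number of anchors is adjusted progressively across training.

```python
import random


def anchor_schedule(num_frames, num_stages, min_anchors):
    """Hypothetical schedule: anchors per stage, decaying linearly
    from every frame (dense) down to min_anchors (sparse)."""
    if num_stages < 2:
        return [min_anchors]
    step = (num_frames - min_anchors) / (num_stages - 1)
    return [round(num_frames - i * step) for i in range(num_stages)]


def sample_anchor_mask(num_frames, num_anchors, rng):
    """Pick num_anchors distinct frames uniformly at random.
    Uniform sampling is an assumed stand-in for whatever anchor
    selection the actual method uses."""
    chosen = set(rng.sample(range(num_frames), num_anchors))
    return [i in chosen for i in range(num_frames)]


if __name__ == "__main__":
    rng = random.Random(0)
    for stage, k in enumerate(anchor_schedule(60, 4, 6)):
        mask = sample_anchor_mask(60, k, rng)
        print(f"stage {stage}: {sum(mask)} anchor frames out of {len(mask)}")
```

In a training loop under these assumptions, each stage would condition the model only on the frames where the mask is true, so that early stages see nearly complete pose guidance and later stages see the sparse setting targeted at inference time.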
