SPG: Improving Motion Diffusion by Smooth Perturbation Guidance
Boseong Jeon
TL;DR
SPG improves the outputs of motion diffusion models without retraining by adding test-time, model-agnostic weak-model guidance. It builds an aligned weak prediction by temporally smoothing the predicted motion and uses that prediction as guidance during denoising, which improves fidelity while preserving motion structure. Across diverse architectures and tasks, SPG achieves state-of-the-art fidelity, often outperforming CFG when each is used alone and complementing CFG when the two are combined. The method is simple to implement and requires minimal code changes, broadening the applicability of diffusion-based motion generation. It improves realism and reduces foot-skating, at the cost of extra evaluation time and, in some cases, abrupt transitions.
Abstract
This paper presents a test-time guidance method that improves the output quality of human motion diffusion models without additional training. To obtain a negative guidance signal, Smooth Perturbation Guidance (SPG) builds a weak model by temporally smoothing the motion during the denoising steps. Compared with model-agnostic methods originating from the image generation field, SPG effectively mitigates the out-of-distribution issues that arise when perturbing motion diffusion models, and it leaves the motion structure intact. This work conducts a comprehensive analysis across distinct model architectures and tasks. Despite its extremely simple implementation and no need for additional training, SPG consistently enhances motion fidelity. The project page can be found at https://spg-blind.vercel.app/
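
To make the idea concrete, the following is a minimal sketch of one guidance step, not the paper's implementation. It assumes the denoiser outputs a clean-motion estimate of shape (frames, features), that the weak prediction is a moving average of that estimate along the time axis, and that the two are combined in the usual extrapolation form pred + scale * (pred - weak). The function names, the box kernel, the edge padding, and the parameters `kernel_size` and `scale` are all hypothetical choices made for illustration.

```python
import numpy as np


def temporal_smooth(motion, kernel_size=5):
    """Moving average along the time axis (axis 0) of a (frames, features) array.

    Assumption: a box filter with edge padding stands in for the smoothing
    operator; the abstract does not specify the filter.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    pad = kernel_size // 2
    # Repeat the first and last frames so the output keeps the same length.
    padded = np.pad(motion, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(kernel_size) / kernel_size
    # Filter each feature channel independently over time.
    columns = [
        np.convolve(padded[:, d], kernel, mode="valid")
        for d in range(motion.shape[1])
    ]
    return np.stack(columns, axis=1)


def spg_guided_prediction(pred, scale=1.0, kernel_size=5):
    """Push the prediction away from its temporally smoothed (weak) version.

    Assumption: the guidance takes the extrapolation form
    pred + scale * (pred - weak), by analogy with other negative-guidance
    methods.
    """
    weak = temporal_smooth(pred, kernel_size)
    return pred + scale * (pred - weak)


# Toy usage: 60 frames, 3 feature channels of a noisy sinusoid.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 60)[:, None]
pred = np.sin(t + np.arange(3)) + 0.05 * rng.standard_normal((60, 3))
guided = spg_guided_prediction(pred, scale=1.5, kernel_size=5)
```

In this sketch, `scale=0` returns the prediction unchanged, and a motion that is constant over time is also left unchanged because its smoothed version is identical to it. In a sampler, a function like `spg_guided_prediction` would be applied to the denoiser output at each step before the update.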
