Optimizing Diffusion Noise Can Serve As Universal Motion Priors

Korrawe Karunratanakul, Konpat Preechakul, Emre Aksan, Thabo Beeler, Supasorn Suwajanakorn, Siyu Tang

TL;DR

This work introduces Diffusion Noise Optimization (DNO), a lightweight, model-agnostic approach that uses pretrained motion diffusion models as universal motion priors by backpropagating through the full denoising chain to optimize the latent noise $x_T$. DNO solves $x_T^* = \arg\min_{x_T} \mathcal{L}(\mathrm{ODESolver}(d(\cdot), x_T))$, where $d(\cdot)$ is the pretrained denoising model and $\mathcal{L}$ is a differentiable loss on the motion output. Because any criterion that can be written as such a loss fits this form, DNO can perform motion editing, refinement, and completion without retraining. Empirical results on HumanML3D show that DNO achieves higher content preservation, lower jitter, and better objective adherence than competitive baselines across editing, denoising, and completion tasks, while remaining flexible enough to handle obstacle avoidance and trajectory changes. Ablations validate design choices such as gradient normalization and a decorrelation loss on latent trajectories, and the method remains efficient enough for practical use provided the number of solver steps is chosen carefully. Overall, DNO provides a versatile, task-agnostic framework for applying diffusion priors to diverse motion-related problems without training or maintaining task-specific models.

Abstract

We propose Diffusion Noise Optimization (DNO), a new method that effectively leverages existing motion diffusion models as motion priors for a wide range of motion-related tasks. Instead of training a task-specific diffusion model for each new task, DNO operates by optimizing the diffusion latent noise of an existing pre-trained text-to-motion model. Given the latent noise corresponding to a human motion, DNO propagates the gradient of the target criterion, defined in motion space, through the whole denoising process to update that latent noise. As a result, DNO supports any use case in which the criterion can be defined as a function of motion. In particular, we show that, for motion editing and control, DNO outperforms existing methods in both achieving the objective and preserving the motion content. DNO accommodates a diverse range of editing modes, including changing trajectory, pose, or joint locations, and avoiding newly added obstacles. In addition, DNO is effective in motion denoising and completion, producing smooth and realistic motion from noisy and partial inputs. DNO achieves these results at inference time without the need for model retraining, offering great versatility for any defined reward or loss function on the motion representation.
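
As a rough illustration of the optimization described above, the sketch below runs noise optimization on a toy problem in plain Python. It is not the authors' implementation, and every concrete choice is an assumption made for illustration: the elementwise `tanh` "denoiser" stands in for a pretrained motion diffusion model, the linear `alpha_bar` schedule and the deterministic DDIM-style update stand in for the ODE solver, and the hand-written chain-rule derivative stands in for automatic differentiation. The update uses the normalized gradient, in the spirit of the gradient normalization mentioned in the TL;DR; the decorrelation loss is omitted.

```python
import math

# Hypothetical stand-in for a pretrained denoiser d(x_t, t) that predicts the
# clean sample. A real motion diffusion model would be a neural network.
def denoise(x, t):
    return math.tanh(x)

def denoise_grad(x, t):
    # Derivative of the toy denoiser with respect to its input.
    return 1.0 - math.tanh(x) ** 2

def make_alpha_bars(num_steps):
    # Assumed schedule: alpha_bar[0] = 1 (clean), decreasing towards x_T.
    return [1.0 - 0.98 * k / num_steps for k in range(num_steps + 1)]

def ode_solve(x_T, alpha_bars):
    """Deterministic DDIM-style sampling from x_T down to x_0.

    Returns x_0 and, per element, d x_0 / d x_T accumulated by the chain rule
    across every denoising step (a manual substitute for autodiff).
    """
    x_0, jac = [], []
    for x in x_T:
        j = 1.0
        for t in range(len(alpha_bars) - 1, 0, -1):
            a_t, a_prev = alpha_bars[t], alpha_bars[t - 1]
            x0_hat, dx0 = denoise(x, t), denoise_grad(x, t)
            s = math.sqrt(1.0 - a_prev) / math.sqrt(1.0 - a_t)
            # x_{t-1} = sqrt(a_prev) * x0_hat + s * (x_t - sqrt(a_t) * x0_hat)
            j *= math.sqrt(a_prev) * dx0 + s * (1.0 - math.sqrt(a_t) * dx0)
            x = math.sqrt(a_prev) * x0_hat + s * (x - math.sqrt(a_t) * x0_hat)
        x_0.append(x)
        jac.append(j)
    return x_0, jac

def loss_and_grad(x_0, targets):
    # Squared error at constrained frames only; targets maps frame -> value.
    loss = sum((x_0[i] - v) ** 2 for i, v in targets.items())
    grad = [0.0] * len(x_0)
    for i, v in targets.items():
        grad[i] = 2.0 * (x_0[i] - v)
    return loss, grad

def optimize_noise(x_T, targets, alpha_bars, lr=0.05, iters=200):
    x_T = list(x_T)
    history = []
    for _ in range(iters):
        x_0, jac = ode_solve(x_T, alpha_bars)
        loss, g0 = loss_and_grad(x_0, targets)
        history.append(loss)
        # Gradient of the loss with respect to the latent noise x_T.
        g = [a * b for a, b in zip(g0, jac)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm < 1e-12:
            break
        # Normalized gradient step on the noise; the "model" is never updated.
        x_T = [x - lr * v / norm for x, v in zip(x_T, g)]
    return x_T, history

alpha_bars = make_alpha_bars(10)
x_T_init = [1.5, -1.0, 0.3, 2.0, -0.5, 0.8]  # latent noise, 6-frame toy "motion"
targets = {0: -0.5, 3: 0.2}                  # keyframe constraints on the output
x_T_opt, history = optimize_noise(x_T_init, targets, alpha_bars)
print("loss before:", history[0], "loss after:", history[-1])
```

In this toy, unconstrained frames receive zero gradient only because the stand-in denoiser acts on each frame independently. A real motion model couples frames, so constraints at a few keyframes would also change the rest of the motion.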

Paper Structure (25 sections, 9 equations, 3 figures, 5 tables, 1 algorithm)


Figures (3)

  • Figure 1: Our proposed Diffusion Noise Optimization (DNO) can leverage existing human motion diffusion models as universal motion priors. We demonstrate its capability in motion editing tasks, where DNO preserves the content of the original motion and accommodates a diverse range of editing modes, including changing trajectory, pose, or joint location, and avoiding newly added obstacles.
  • Figure 2: Diffusion Noise Optimization (DNO).
  • Figure 3: Qualitative results from the motion editing task. Each line indicates the starting and target location of the selected joint at a specific keyframe.