TADA: Improved Diffusion Sampling with Training-free Augmented Dynamics
Tianrong Chen, Huangjie Zheng, David Berthelot, Jiatao Gu, Josh Susskind, Shuangfei Zhai
TL;DR
Diffusion models remain limited by slow sampling, which motivates training-free acceleration strategies. The authors introduce Training-free Augmented DynAmics (TADA), a momentum-diffusion-based approach that uses higher-dimensional initial noise and an ODE solver to sample faster from pretrained models, while offering tunable control over the level of detail at no extra computational cost. They prove that momentum diffusion and conventional diffusion are equivalent with respect to training, which allows pretrained models to be reused directly. Experiments on EDM, EDM2, and Stable Diffusion 3 show consistent gains, with sampling up to 186% faster at comparable FID on ImageNet and improved FID/FD-DINOv2 scores across numbers of function evaluations (NFEs), along with qualitative improvements at lower classifier-free guidance (CFG) scales and in under-parameterized regimes. Limitations include incomplete disentanglement of augmentation and dynamics and diminishing gains for high-capacity models, which points future work toward advanced solvers and stochasticity control.
Abstract
Diffusion models have demonstrated exceptional capabilities in generating high-fidelity images but typically suffer from inefficient sampling. Many solver designs and noise scheduling strategies have been proposed to dramatically improve sampling speeds. In this paper, we introduce a new sampling method that is up to $186\%$ faster than the current state-of-the-art solver at comparable FID on ImageNet512. This new sampling method is training-free and uses an ordinary differential equation (ODE) solver. The key to our method lies in using higher-dimensional initial noise, which makes it possible to produce more detailed samples with fewer function evaluations from existing pretrained diffusion models. In addition, by design, our solver allows the level of detail to be controlled through a simple hyperparameter at no extra computational cost. We show how our approach leverages momentum dynamics by establishing a fundamental equivalence between momentum diffusion models and conventional diffusion models with respect to their training paradigms. Moreover, we observe that the use of higher-dimensional noise naturally exhibits characteristics similar to those of stochastic differential equations (SDEs). Finally, we demonstrate strong performance on a set of representative pretrained diffusion models, including EDM, EDM2, and Stable Diffusion 3, which cover models in both pixel and latent spaces, as well as class- and text-conditional settings. The code is available at https://github.com/apple/ml-tada.
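
To make the trade-off between function evaluations and sample quality concrete, the following is a minimal sketch of plain Euler ODE sampling on a toy problem. It is not the TADA solver and does not use momentum dynamics or higher-dimensional noise. Everything in it is an assumption made for illustration: the data are a 1-D Gaussian with standard deviation `SIGMA_DATA`, so the ideal denoiser is known in closed form and stands in for a pretrained network, and the noise schedule borrows the EDM-style parameters `sigma_min=0.002`, `sigma_max=80`, `rho=7`. The helper names (`karras_sigmas`, `toy_denoiser`, `euler_sample`) are hypothetical.

```python
import math

SIGMA_DATA = 0.5  # std of the toy 1-D Gaussian "dataset" (assumption)

def karras_sigmas(num_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, followed by 0."""
    hi, lo = sigma_max ** (1 / rho), sigma_min ** (1 / rho)
    ramp = [i / max(num_steps - 1, 1) for i in range(num_steps)]
    return [(hi + r * (lo - hi)) ** rho for r in ramp] + [0.0]

def toy_denoiser(x, sigma):
    """Exact posterior mean E[x0 | x] when x0 ~ N(0, SIGMA_DATA^2).

    Stands in for a pretrained denoising network (assumption).
    """
    return x * SIGMA_DATA**2 / (SIGMA_DATA**2 + sigma**2)

def euler_sample(x_init, num_steps):
    """Euler integration of dx/dsigma = (x - D(x, sigma)) / sigma.

    Returns the final sample and the number of function evaluations (NFE),
    i.e. the number of denoiser calls.
    """
    sigmas = karras_sigmas(num_steps)
    x, nfe = x_init, 0
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        d = (x - toy_denoiser(x, s_cur)) / s_cur  # one denoiser call
        nfe += 1
        x = x + (s_next - s_cur) * d
    return x, nfe

if __name__ == "__main__":
    # For this toy ODE the exact solution is x(sigma) = c * sqrt(SIGMA_DATA^2 + sigma^2),
    # so starting at sqrt(SIGMA_DATA^2 + 80^2) the exact endpoint is SIGMA_DATA.
    x_init = math.sqrt(SIGMA_DATA**2 + 80.0**2)
    for steps in (4, 16, 64):
        x, nfe = euler_sample(x_init, steps)
        print(f"NFE={nfe:3d}  abs error={abs(x - SIGMA_DATA):.4f}")
```

In this toy setting the discretization error shrinks as the NFE budget grows. This is the cost that faster solvers, including the method described in this paper, aim to reduce for real pretrained models.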
