Formalizing the Sampling Design Space of Diffusion-Based Generative Models via Adaptive Solvers and Wasserstein-Bounded Timesteps
Sangwoo Jo, Sungjoon Choi
TL;DR
This work addresses the high sampling cost of diffusion-based generative models by introducing SDM, a training-free framework that jointly optimizes the solver order and the timestep schedule according to the geometry of the diffusion trajectory. It shows that low-order solvers suffice in the early, high-noise steps, where the dynamics are nearly linear, while later steps need higher-order integration as curvature increases. It also derives a Wasserstein-bounded adaptive timestep schedule that explicitly bounds the local discretization error. The method combines a curvature-based adaptive solver with a Wasserstein-based timestep strategy and an $N$-step resampling mechanism, and it reports state-of-the-art FID with fewer function evaluations (NFEs) on CIFAR-10, FFHQ, AFHQv2, and ImageNet without retraining. SDM thus serves as a principled, plug-in approach to accelerating diffusion sampling, with improvements reported across VP/VE parameterizations and datasets.
Abstract
Diffusion-based generative models have achieved remarkable performance across various domains, yet their practical deployment is often limited by high sampling costs. While prior work focuses on training objectives or individual solvers, the holistic design of sampling, specifically solver selection and timestep scheduling, remains dominated by static heuristics. In this work, we revisit this challenge through a geometric lens, proposing SDM, a principled framework that aligns the numerical solver with the intrinsic properties of the diffusion trajectory. By analyzing the ODE dynamics, we show that efficient low-order solvers suffice in the early high-noise stages, while higher-order solvers can be progressively deployed to handle the increasing non-linearity of later stages. Furthermore, we formalize timestep scheduling through a Wasserstein-bounded optimization framework. This framework systematically derives adaptive timesteps that explicitly bound the local discretization error, ensuring that the sampling process remains faithful to the underlying continuous dynamics. Without requiring additional training or architectural modifications, SDM achieves state-of-the-art performance across standard benchmarks, including an FID of 1.93 on CIFAR-10, 2.41 on FFHQ, and 1.98 on AFHQv2, with fewer function evaluations than existing samplers. Our code is available at https://github.com/aiimaginglab/sdm.
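
To make the solver-order claim concrete, the following is a minimal toy sketch and not the SDM implementation. It assumes a 1-D Gaussian data distribution (so the ideal denoiser is available in closed form), a noise-level parameterization of the probability-flow ODE, and a power-law schedule with exponent 7. The switching rule (relative change of the drift between consecutive steps, with tolerance 0.05) is a hypothetical stand-in for a curvature criterion, and all names (`drift`, `power_schedule`, `sample`, `tol`) are illustrative.

```python
import math

S_DATA = 0.5  # assumed std of the toy 1-D Gaussian data distribution


def denoiser(x, sigma):
    """Ideal denoiser E[x0 | x] for data ~ N(0, S_DATA^2) at noise level sigma."""
    return x * S_DATA**2 / (S_DATA**2 + sigma**2)


def drift(x, sigma):
    """Probability-flow ODE drift dx/dsigma = (x - D(x, sigma)) / sigma."""
    return (x - denoiser(x, sigma)) / sigma


def power_schedule(n_steps, sigma_max=80.0, sigma_min=0.002, rho=7.0):
    """n_steps + 1 decreasing noise levels, uniform in sigma^(1/rho) (assumed)."""
    hi, lo = sigma_max ** (1 / rho), sigma_min ** (1 / rho)
    return [(hi + i / n_steps * (lo - hi)) ** rho for i in range(n_steps + 1)]


def sample(x, sigmas, tol):
    """Integrate from sigmas[0] to sigmas[-1]; returns (x, nfe).

    A step uses Heun (2nd order) when the relative change of the drift since
    the previous step exceeds tol, and Euler (1st order) otherwise. This is a
    hypothetical curvature proxy. tol=math.inf never switches (pure Euler);
    a negative tol always switches (pure Heun).
    """
    nfe, prev_d = 0, None
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        h = s_next - s_cur
        d = drift(x, s_cur)
        nfe += 1
        if prev_d is None:
            change = 0.0  # no history yet on the first step
        else:
            change = abs(d - prev_d) / max(abs(prev_d), 1e-12)
        x_euler = x + h * d
        if change > tol:
            # Heun correction: average the drift at both ends of the step.
            d_next = drift(x_euler, s_next)
            nfe += 1
            x = x + h * 0.5 * (d + d_next)
        else:
            x = x_euler
        prev_d = d
    return x, nfe


def exact(x0, sigma_start, sigma_end):
    """Closed-form ODE solution for the Gaussian toy model."""
    return x0 * math.sqrt(
        (S_DATA**2 + sigma_end**2) / (S_DATA**2 + sigma_start**2)
    )


sigmas = power_schedule(17)
x_init = 80.0  # one starting point at roughly one prior standard deviation
x_true = exact(x_init, sigmas[0], sigmas[-1])

results = {}
for name, tol in [("euler", math.inf), ("adaptive", 0.05), ("heun", -1.0)]:
    x_end, nfe = sample(x_init, sigmas, tol)
    results[name] = (abs(x_end - x_true) / abs(x_true), nfe)
    print(f"{name:8s} relative error = {results[name][0]:.4f}  NFE = {nfe}")
```

In this toy setting the drift is almost constant at high noise levels, so the adaptive run takes Euler steps there and switches to Heun only once the drift starts to change noticeably. Its NFE count therefore falls between pure Euler and pure Heun, while its final error should be well below that of pure Euler. The threshold and the proxy are illustrative choices; the paper's actual criterion and its Wasserstein-bounded schedule are not reproduced here.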
