Formalizing the Sampling Design Space of Diffusion-Based Generative Models via Adaptive Solvers and Wasserstein-Bounded Timesteps

Sangwoo Jo, Sungjoon Choi

TL;DR

This work addresses the high sampling cost of diffusion-based generative models by introducing SDM, a training-free framework that jointly chooses the solver order and the timestep schedule based on the geometry of the diffusion trajectory. It shows that low-order solvers suffice in the early, high-noise steps, where the dynamics are nearly linear, while later steps need higher-order integration as curvature increases. It also derives a Wasserstein-bounded adaptive schedule that explicitly bounds the local discretization error. The method combines a curvature-based adaptive solver with a Wasserstein-based timestep strategy and an $N$-step resampling mechanism, and reports state-of-the-art FID with fewer NFEs on CIFAR-10, FFHQ, AFHQv2, and ImageNet without retraining. SDM thus serves as a plug-in way to accelerate diffusion sampling, with improvements reported across VP/VE parameterizations and datasets.

Abstract

Diffusion-based generative models have achieved remarkable performance across various domains, yet their practical deployment is often limited by high sampling costs. While prior work focuses on training objectives or individual solvers, the holistic design of sampling, specifically solver selection and scheduling, remains dominated by static heuristics. In this work, we revisit this challenge through a geometric lens, proposing SDM, a principled framework that aligns the numerical solver with the intrinsic properties of the diffusion trajectory. By analyzing the ODE dynamics, we show that efficient low-order solvers suffice in early high-noise stages while higher-order solvers can be progressively deployed to handle the increasing non-linearity of later stages. Furthermore, we formalize the scheduling by introducing a Wasserstein-bounded optimization framework. This method systematically derives adaptive timesteps that explicitly bound the local discretization error, ensuring the sampling process remains faithful to the underlying continuous dynamics. Without requiring additional training or architectural modifications, SDM achieves state-of-the-art performance across standard benchmarks, including an FID of 1.93 on CIFAR-10, 2.41 on FFHQ, and 1.98 on AFHQv2, with a reduced number of function evaluations compared to existing samplers. Our code is available at https://github.com/aiimaginglab/sdm.
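
The solver-order idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the curvature proxy (relative change of the ODE slope between consecutive steps), the threshold value `tau`, the EDM-style noise schedule, and the Gaussian toy denoiser are all assumptions chosen so the example runs on its own. The sampler takes a cheap Euler step while the proxy stays below `tau` and spends a second denoiser evaluation on a Heun correction once it rises.

```python
import numpy as np


def edm_sigmas(n, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """EDM-style noise levels (n values, decreasing), followed by a final 0."""
    ramp = np.arange(n) / (n - 1)
    lo, hi = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    return np.append((hi + ramp * (lo - hi)) ** rho, 0.0)


def adaptive_order_sample(x, sigmas, denoise, tau):
    """Integrate dx/dsigma = (x - D(x, sigma)) / sigma with Euler or Heun steps.

    Returns the final sample and the number of denoiser evaluations (NFE).
    """
    nfe, d_prev = 0, None
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        d = (x - denoise(x, s_cur)) / s_cur
        nfe += 1
        # Hypothetical curvature proxy: relative change of the slope between steps.
        if d_prev is None:
            kappa = 0.0
        else:
            kappa = np.linalg.norm(d - d_prev) / (np.linalg.norm(d) + 1e-12)
        x_euler = x + (s_next - s_cur) * d
        if kappa > tau and s_next > 0:
            # Curved region: spend one extra evaluation on a Heun correction.
            d_next = (x_euler - denoise(x_euler, s_next)) / s_next
            nfe += 1
            x = x + (s_next - s_cur) * 0.5 * (d + d_next)
        else:
            # Near-linear region (or the final step to sigma = 0): Euler only.
            x = x_euler
        d_prev = d
    return x, nfe


# Toy check: for data drawn from N(0, DATA_STD^2 I) the ideal denoiser is known,
# and the probability flow ODE has the exact solution used for `exact` below.
DATA_STD = 0.5


def gaussian_denoiser(x, sigma):
    return x * DATA_STD**2 / (DATA_STD**2 + sigma**2)


rng = np.random.default_rng(0)
sigmas = edm_sigmas(18)
x_init = sigmas[0] * rng.standard_normal(16)
exact = x_init * DATA_STD / np.sqrt(DATA_STD**2 + sigmas[0] ** 2)

for name, tau in [("euler", np.inf), ("adaptive", 0.05), ("heun", -1.0)]:
    x_final, nfe = adaptive_order_sample(x_init, sigmas, gaussian_denoiser, tau)
    err = np.linalg.norm(x_final - exact) / np.linalg.norm(exact)
    print(f"{name:8s} NFE={nfe:2d} relative error={err:.4f}")
```

Setting `tau` to infinity recovers a pure Euler sampler and a negative `tau` recovers Heun on every step except the last, so the threshold interpolates between the two NFE budgets. On this toy problem the slope is almost constant at high noise, which is why the extra evaluations are only triggered late in the trajectory.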

Paper Structure

This paper contains 33 sections, 4 theorems, 95 equations, 9 figures, 5 tables, 1 algorithm.

Key Result

Theorem 3.1

Under the EDM, VP, and VE parameterizations, the second derivative of the probability flow ODE admits an explicit expression for each case (EDM, VP, VE); the equations themselves are omitted here.
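
For orientation, the EDM case can be worked out by hand. The derivation below assumes the standard EDM form of the probability flow ODE with $\sigma(t) = t$ and a denoiser $D(x;\sigma)$; this notation is an assumption, and the theorem's exact statement (including the VP and VE cases) may be written differently in the paper.

$$
\frac{dx}{d\sigma} = \frac{x - D(x;\sigma)}{\sigma}
$$

Differentiating once more along the trajectory with the quotient rule, and substituting the first equation for $dx/d\sigma$, gives

$$
\frac{d^2x}{d\sigma^2}
= \frac{1}{\sigma}\left(\frac{dx}{d\sigma} - \frac{dD}{d\sigma}\right) - \frac{x - D(x;\sigma)}{\sigma^2}
= -\frac{1}{\sigma}\,\frac{dD}{d\sigma},
\qquad
\frac{dD}{d\sigma} = \frac{\partial D}{\partial \sigma} + \frac{\partial D}{\partial x}\,\frac{dx}{d\sigma}.
$$

Under these assumptions the trajectory is exactly straight wherever the denoiser output does not change along it, which matches the near-linear behavior described for the high-noise regime.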

Figures (9)

  • Figure 1: Overview of SDM. (Inset) We introduce an adaptive timestep optimization derived from an analytical upper bound on the Wasserstein error. Here, $\eta_i$ serves as a controllable schedule that explicitly governs the allowable error budget at each step, tightening step sizes where the flow is most sensitive. (Main) Complementing this, SDM adapts the solver order to the trajectory's geometry. In the early high-noise regime (black), the flow is nearly linear, allowing efficient low-order integration. As the trajectory approaches the data manifold (blue), the curvature increases, necessitating high-order solvers. These two strategies act as independent and complementary components, jointly improving both sample quality and computational efficiency.
  • Figure 2: Relative curvature $\hat{\kappa}_{rel}$ as a function of noise level $\sigma$ for standard benchmarks. The curvature is approximately linear in the noise level $\sigma$ on a log scale, consistent with the theoretical derivation of the second-order probability flow ODE.
  • Figure 3: Distribution of the local Wasserstein error bound $\eta_t$ over diffusion timesteps for ImageNet $64\times64$. EDM schedules show an initial increase followed by a decay, reaching a maximum during the intermediate sampling stages. In contrast, SDM schedules allocate a larger portion of the error budget to the early high-noise stages, resulting in improved sample quality.
  • Figure 4: Ablation on threshold $\tau_k$. FID as a function of the curvature threshold $\tau_k$ for CIFAR-10 ($32\times32$, solid) and AFHQv2 ($64\times64$, dashed), evaluated under unconditional and conditional settings using the step-scheduler-based adaptive solver. Markers denote the selected $\tau_k$ values that yield the best model performance for each dataset and training configuration.
  • Figure 5: Qualitative comparison on AFHQv2 ($64\times64$) across VP (left) and VE (right) parameterizations. From top to bottom: EDM (Heun), SDM (adaptive solver), SDM (adaptive scheduling), and SDM (adaptive solver + scheduling). Each panel shows a $3 \times 5$ grid of generated samples; corresponding FID and NFE are reported below.
  • ...and 4 more figures
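
The role of the per-step error budget $\eta_i$ in the Figure 1 and Figure 3 captions can be sketched in Python. This is not the bound from Theorem 3.2: it swaps in the textbook leading-order local truncation error of an Euler step, $\tfrac{1}{2}h^2\lVert x''\rVert$, as a stand-in for the local Wasserstein bound. The function names, the constant budget `eta`, the `max_ratio` cap, and the Gaussian toy trajectory are all assumptions made for illustration.

```python
import math


def toy_second_derivative(sigma, data_std=0.5):
    """|d^2x/dsigma^2| on the toy trajectory x(sigma) = sqrt(data_std^2 + sigma^2)."""
    return data_std**2 / (data_std**2 + sigma**2) ** 1.5


def budgeted_sigmas(sigma_max, sigma_min, eta, max_ratio=0.5):
    """Greedily pick noise levels so each Euler step's estimated error is <= eta."""
    sigmas = [sigma_max]
    while sigmas[-1] > sigma_min:
        s = sigmas[-1]
        # Solve 0.5 * h^2 * |x''(s)| = eta for the step size h.
        h = math.sqrt(2.0 * eta / toy_second_derivative(s))
        # Assumed safeguard: never remove more than a fixed fraction of sigma.
        h = min(h, max_ratio * s)
        sigmas.append(max(s - h, sigma_min))
    return sigmas


schedule = budgeted_sigmas(sigma_max=80.0, sigma_min=0.002, eta=1e-3)
print(len(schedule) - 1, "steps")
print([round(s, 4) for s in schedule])
```

A smaller `eta` shortens the steps wherever the curvature term dominates, so the budget trades accuracy against the number of steps. A per-step budget $\eta_i$ would replace the constant `eta` inside the loop.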

Theorems & Definitions (5)

  • Theorem 3.1
  • Theorem 3.2: Step Size with Local Wasserstein Bound
  • Theorem 3.3: Total Wasserstein Distance Bound
  • Proposition 3.1
  • proof