Few-Step Diffusion Sampling Through Instance-Aware Discretizations

Liangyu Yuan, Ruoyu Wang, Tong Zhao, Dingwen Fu, Mingkun Lei, Beier Zhu, Chi Zhang

Abstract

Diffusion and flow matching models generate high-fidelity data by simulating paths defined by Ordinary or Stochastic Differential Equations (ODEs/SDEs), starting from a tractable prior distribution. The probability flow ODE formulation enables the use of advanced numerical solvers to accelerate sampling. Orthogonal yet vital to solver design is the discretization strategy. While early approaches employed handcrafted heuristics and recent methods adopt optimization-based techniques, most existing strategies enforce a globally shared timestep schedule across all samples. This uniform treatment fails to account for instance-specific complexity in the generative process, potentially limiting performance. Motivated by controlled experiments on synthetic data, which reveal the suboptimality of global schedules under instance-specific dynamics, we propose an instance-aware discretization framework. Our method learns to adapt timestep allocations based on input-dependent priors, extending gradient-based discretization search to the conditional generative setting. Empirical results across diverse settings, including synthetic data, pixel-space diffusion, and latent-space image and video flow matching models, demonstrate that our method consistently improves generation quality with marginal tuning cost compared to training and negligible inference overhead.
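To make the role of the discretization concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy 1-D probability flow ODE in the noise level sigma, dx/dsigma = (x - mu) * sigma / (s^2 + sigma^2), which is the flow for Gaussian data N(mu, s^2). The function names, the noise range [0.002, 80], and the hand-picked `custom` schedule are all hypothetical; in the paper the per-instance schedule is produced by a learned network rather than chosen by hand.

```python
import random

def drift(x, sigma, mu=0.0, s=1.0):
    # dx/dsigma of the toy probability flow ODE for Gaussian data N(mu, s^2).
    return (x - mu) * sigma / (s * s + sigma * sigma)

def euler_sample(x_T, sigmas):
    # Integrate from sigmas[0] down to sigmas[-1] with explicit Euler steps.
    # `sigmas` is the discretization; its length minus one is the NFE.
    x = x_T
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        x = x + (s_next - s_cur) * drift(x, s_cur)
    return x

def uniform_schedule(nfe, sigma_max=80.0, sigma_min=0.002):
    # Evenly spaced noise levels, ending exactly at sigma_min.
    step = (sigma_max - sigma_min) / nfe
    return [sigma_max - i * step for i in range(nfe)] + [sigma_min]

random.seed(0)
x_T = [random.gauss(0.0, 80.0) for _ in range(4)]  # initial noise samples

# Fine-grained Euler solution used as the reference endpoint.
reference = [euler_sample(x, uniform_schedule(100)) for x in x_T]

# Same NFE budget (3 steps), two different discretizations.
coarse = [euler_sample(x, uniform_schedule(3)) for x in x_T]
custom = [80.0, 10.0, 1.0, 0.002]  # hypothetical hand-picked schedule
tuned = [euler_sample(x, custom) for x in x_T]

coarse_err = [abs(c - r) for c, r in zip(coarse, reference)]
tuned_err = [abs(t - r) for t, r in zip(tuned, reference)]
```

Because `euler_sample` takes the schedule as an argument, each sample can in principle be integrated with its own list of noise levels, which is the degree of freedom an instance-aware method would exploit.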

Paper Structure (29 sections, 15 equations, 20 figures, 19 tables, 1 algorithm)

Figures (20)

  • Figure 1: Our instance-aware discretization improves sampling quality by generating a tailored discretization $\xi^\phi$ for each initial noise $\mathbf{x}_T$ and condition $\mathbf{c}$, outperforming heuristic and globally optimized schedules. The orange contour represents the ground-truth data distribution; blue dots represent samples generated under different discretizations. ($\Psi(\cdot,\cdot,\cdot)$ denotes the ODE path.)
  • Figure 2: Comparison of endpoint (accumulated) errors for different timestep strategies (NFE=3). Each point is an initial noise sample $\mathbf{x}_T\sim\mathcal{N}(0,\sigma_T^2\mathbf{I})$, colored by its L2 error relative to a 100-step Euler solution used as ground truth. Methods: (a) Uniform timesteps. (b) Globally optimized timesteps. (c) Instance-specific timesteps (overfitted). (d) Instance-specific timesteps (learned through network $\phi$).
  • Figure 3: Quantitative comparison on synthetic experiments, evaluating MSE to teacher samples, KL divergence, and Wasserstein distance across various NFEs (log scale). Methods include uniform heuristics, globally optimized timesteps, and our proposed instance-level optimized timesteps conditioned on the prior sample.
  • Figure 4: Architectural design of the proposed lightweight prior conditioning network. When conditional information is available, class indices are first scaled by a factor of $\frac{1}{\sqrt{\text{label\_dim}}}$ and then processed through a linear layer. For prompt embeddings (FLUX.1-dev), T5 embeddings undergo mean pooling to reduce dimensionality before being concatenated with CLIP embeddings.
  • Figure 5: Ablation study on FFHQ, LSUN-Bedroom, and FLUX.1-dev. The instance condition is observed to be the most influential factor, while the effect of shifted factors varies across pretrained models.
  • ...and 15 more figures
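
The conditioning preprocessing described in the Figure 4 caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes class indices are one-hot encoded before scaling, represents embeddings as plain Python lists, and uses hypothetical function names and tiny dimensions. A practical implementation would use a tensor library.

```python
import math

def class_condition(class_idx, label_dim, weight, bias):
    # Assumed one-hot encoding of the class index, scaled by 1/sqrt(label_dim).
    scale = 1.0 / math.sqrt(label_dim)
    scaled = [scale if j == class_idx else 0.0 for j in range(label_dim)]
    # Linear layer: weight has shape (label_dim, out_dim), bias has shape (out_dim,).
    return [
        sum(scaled[j] * weight[j][k] for j in range(label_dim)) + bias[k]
        for k in range(len(bias))
    ]

def prompt_condition(t5_tokens, clip_pooled):
    # Mean-pool the T5 token embeddings over the sequence axis,
    # then concatenate the pooled vector with the CLIP embedding.
    n_tokens = len(t5_tokens)
    dim = len(t5_tokens[0])
    t5_mean = [sum(tok[d] for tok in t5_tokens) / n_tokens for d in range(dim)]
    return t5_mean + list(clip_pooled)

# Toy usage with hypothetical sizes: 4 classes, identity linear layer.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
class_vec = class_condition(2, 4, identity, [0.0] * 4)

# Two T5 tokens of width 2, one CLIP feature of width 1.
prompt_vec = prompt_condition([[1.0, 3.0], [3.0, 5.0]], [9.0])
```

Mean pooling collapses the variable-length T5 sequence into a single fixed-size vector, so the concatenated result has the same width regardless of prompt length.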