Accelerating Diffusion Sampling with Optimized Time Steps

Shuchen Xue, Zhaoqiang Liu, Fei Chen, Shifeng Zhang, Tianyang Hu, Enze Xie, Zhenguo Li

TL;DR

The paper tackles the bottleneck of sampling efficiency in diffusion probabilistic models by optimizing the time-step schedule for high-order ODE solvers. It presents a general, training-free framework that formulates an optimization over the sampling times, solvable efficiently with a constrained trust-region method, and applicable to solvers like UniPC and DPM-Solver++. The approach yields substantial FID improvements across pixel-space and latent-space diffusion models on CIFAR-10, ImageNet, FFHQ, and AFHQv2, even when using as few as five neural function evaluations. This work enables faster, high-quality diffusion sampling with minimal overhead and broad plug-and-play applicability, making diffusion-based generation more practical for real-world deployment.

Abstract

Diffusion probabilistic models (DPMs) have shown remarkable performance in high-resolution image synthesis, but their sampling efficiency still leaves much to be desired because of the typically large number of sampling steps. Recent advancements in high-order numerical ODE solvers for DPMs have enabled the generation of high-quality images with far fewer sampling steps. While this is a significant development, most sampling methods still employ uniform time steps, which are not optimal when using a small number of steps. To address this issue, we propose a general framework for designing an optimization problem that seeks more appropriate time steps for a specific numerical ODE solver for DPMs. This optimization problem aims to minimize the distance between the ground-truth solution to the ODE and an approximate solution corresponding to the numerical solver. It can be efficiently solved using the constrained trust region method, taking less than $15$ seconds. Our extensive experiments on both unconditional and conditional sampling using pixel- and latent-space DPMs demonstrate that, when combined with the state-of-the-art sampling method UniPC, our optimized time steps significantly improve image generation performance in terms of FID scores for datasets such as CIFAR-10 and ImageNet, compared to using uniform time steps.
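
The idea of choosing time steps by constrained optimization can be illustrated with a toy sketch. The objective below is not the paper's error bound: `local_error_proxy`, its weight function, the unit interval, and `min_gap` are all hypothetical stand-ins chosen so that the error concentrates near one end of the interval. Only the general shape matches the abstract: the time steps are the optimization variables, ordering constraints keep them monotone, and SciPy's `trust-constr` method is assumed as the constrained trust-region solver.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, minimize


def local_error_proxy(interior, t_start=0.0, t_end=1.0):
    """Toy stand-in objective (an assumption, not the paper's bound).

    Sums weight(midpoint) * step**3 over all steps, with a weight that is
    large near t_start, so small steps are rewarded there.
    """
    t = np.concatenate(([t_start], interior, [t_end]))
    h = np.diff(t)                       # step sizes
    mid = 0.5 * (t[:-1] + t[1:])         # step midpoints
    weight = 1.0 + 50.0 * np.exp(-10.0 * mid)
    return float(np.sum(weight * h**3))


def optimize_time_steps(num_steps, min_gap=1e-3):
    """Optimize the interior nodes of a grid on [0, 1] with num_steps steps."""
    uniform = np.linspace(0.0, 1.0, num_steps + 1)
    x0 = uniform[1:-1]                   # uniform grid as the starting point
    n = x0.size

    # Each row of D computes the gap between two consecutive interior nodes.
    D = np.zeros((n - 1, n))
    for i in range(n - 1):
        D[i, i] = -1.0
        D[i, i + 1] = 1.0
    monotone = LinearConstraint(D, lb=min_gap, ub=np.inf)

    # Keep interior nodes strictly inside the fixed endpoints.
    box = Bounds(min_gap, 1.0 - min_gap)

    result = minimize(
        local_error_proxy,
        x0,
        method="trust-constr",
        bounds=box,
        constraints=[monotone],
    )
    optimized = np.concatenate(([0.0], result.x, [1.0]))
    return uniform, optimized


if __name__ == "__main__":
    uniform, optimized = optimize_time_steps(num_steps=5)
    print("uniform  :", np.round(uniform, 3))
    print("optimized:", np.round(optimized, 3))
    print("proxy (uniform)  :", local_error_proxy(uniform[1:-1]))
    print("proxy (optimized):", local_error_proxy(optimized[1:-1]))
```

With only a handful of variables, a problem of this shape is cheap to solve, which is consistent with the small overhead reported in the abstract. A real implementation would replace the proxy with a solver-specific error estimate.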


Paper Structure

This paper contains 25 sections, 2 theorems, 26 equations, 8 figures, 9 tables, 1 algorithm.

Key Result

Lemma 1

For any $\mathbf{x}_0 \sim q_0$ and $P_0 \in (0,1)$, with probability at least $1-P_0$, the following event occurs: For all $t \in \{t_0, t_1,\ldots,t_N\}$ and $\mathbf{x}_t \sim q_t$, we have where $\tilde{\eta} := \sqrt{\frac{N+1}{P_0}} \eta$ and $\tilde{\varepsilon}_t := \frac{\varepsilon_t\sigma_t^2}{\alpha_t}$.

Figures (8)

  • Figure 1: Sampling quality measured by FID ($\downarrow$) of different discretization schemes of time steps for UniPC (Zhao et al., 2023) with varying NFEs on various DPMs and datasets.
  • Figure 2: Generated images by UniPC (Zhao et al., 2023) with only 5 NFEs for various discretization schemes of time steps from DiT-XL-2 ImageNet 256x256 model (Peebles & Xie, 2022) (with cfg scale $s=1.5$ and the same random seed).
  • Figure 3: Sampling quality measured by FID ($\downarrow$) of different discretization schemes of time steps for UniPC (Zhao et al., 2023) with varying NFEs on MS-COCO 256x256 using PixArt-$\alpha$-256 model (Chen et al., 2023) (with cfg scale $s=2.5$).
  • Figure 4: Generated images by UniPC (Zhao et al., 2023) with only 5 NFEs for various discretization schemes of time steps from DiT-XL-2 ImageNet 256x256 model (Peebles & Xie, 2022) (with cfg scale $s=1.5$ and the same random seed).
  • Figure 5: Generated images by UniPC (Zhao et al., 2023) with only 5 NFEs for various discretization schemes of time steps from PixArt-$\alpha$-512 model (Chen et al., 2023) (with cfg scale $s=2.5$ and the same random seed).
  • ...and 3 more figures

Theorems & Definitions (6)

  • Remark 1
  • Remark 2
  • Lemma 1
  • proof
  • Theorem 1
  • proof