Faster Diffusion Models via Higher-Order Approximation

Gen Li, Yuchen Zhou, Yuting Wei, Yuxin Chen

TL;DR

The paper tackles slow sampling in diffusion models by proposing HEROISM, a training-free, high-order ODE-based sampler that uses Lagrange interpolation over multiple time points and a successive-refinement scheme. It proves convergence in total variation to the target distribution under mild assumptions, with an iteration complexity scaling as $\tilde{O}(d^{1+2/K}/\varepsilon^{1/K})$ for any fixed integer $K$, so the dependence on the data dimension $d$ approaches linear as $K$ grows. The theory accommodates inexact score estimates, degrading gracefully with score and Jacobian errors, and improves upon prior deterministic and stochastic acceleration results by relaxing smoothness and log-concavity requirements and reducing dependence on the data support size. Overall, the work provides a theoretical framework for higher-order accelerated samplers in diffusion models and suggests directions for scaling and stochastic extensions.
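
To make the scaling concrete, here is a rough worked instance of the stated rate $d^{1+2/K}\varepsilon^{-1/K}$ for a few values of $K$. It ignores logarithmic factors and constants, so only the exponents are meaningful; how the hidden constants depend on $K$ is not stated in this summary.

$$
K=1:\ d^{3}\,\varepsilon^{-1}, \qquad K=2:\ d^{2}\,\varepsilon^{-1/2}, \qquad K=4:\ d^{3/2}\,\varepsilon^{-1/4}.
$$

As $K$ increases, the exponent on $d$ decreases toward $1$ and the exponent on $1/\varepsilon$ decreases toward $0$.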

Abstract

In this paper, we explore provable acceleration of diffusion models without any additional retraining. Focusing on the task of approximating a target data distribution in $\mathbb{R}^d$ to within $\varepsilon$ total-variation distance, we propose a principled, training-free sampling algorithm that requires only on the order of $d^{1+2/K} \varepsilon^{-1/K}$ score function evaluations (up to logarithmic factors) in the presence of accurate scores, where $K>0$ is an arbitrary fixed integer. This result applies to a broad class of target data distributions, without the need for assumptions such as smoothness or log-concavity. Our theory is robust vis-a-vis inexact score estimation, degrading gracefully as the score estimation error increases, without demanding higher-order smoothness on the score estimates as assumed in previous work. The proposed algorithm draws insight from high-order ODE solvers, leveraging high-order Lagrange interpolation and successive refinement to approximate the integral derived from the probability flow ODE. More broadly, our work develops a theoretical framework towards understanding the efficacy of high-order methods for accelerated sampling.
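
The abstract mentions approximating an integral with high-order Lagrange interpolation. The sketch below illustrates only that generic building block, not the paper's algorithm: it interpolates an integrand at a few time points and integrates the interpolating polynomial exactly. The function names (`lagrange_quadrature_weights`, `integrate_with_interpolation`) and the test integrand are hypothetical, chosen for illustration; the actual sampler integrates a score-dependent term from the probability flow ODE and adds a successive-refinement step that is not shown here.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def lagrange_quadrature_weights(nodes, a, b):
    """Return weights w_i so that sum_i w_i * f(nodes[i]) equals the integral
    over [a, b] of the Lagrange interpolant of f through the given nodes."""
    nodes = np.asarray(nodes, dtype=float)
    weights = np.empty_like(nodes)
    for i, t_i in enumerate(nodes):
        others = np.delete(nodes, i)
        # Basis polynomial ell_i(t) = prod_{j != i} (t - t_j) / (t_i - t_j),
        # stored as coefficients in increasing degree.
        coeffs = P.polyfromroots(others) / np.prod(t_i - others)
        # Integrate the basis polynomial exactly over [a, b].
        antideriv = P.polyint(coeffs)
        weights[i] = P.polyval(b, antideriv) - P.polyval(a, antideriv)
    return weights


def integrate_with_interpolation(f, nodes, a, b):
    """Approximate the integral of f over [a, b] using interpolation at nodes."""
    w = lagrange_quadrature_weights(nodes, a, b)
    return float(sum(w_i * f(t_i) for w_i, t_i in zip(w, nodes)))


if __name__ == "__main__":
    exact = np.e - 1.0  # integral of exp over [0, 1]
    for nodes in ([0.0, 1.0], [0.0, 0.5, 1.0], [0.0, 1 / 3, 2 / 3, 1.0]):
        approx = integrate_with_interpolation(np.exp, nodes, 0.0, 1.0)
        print(len(nodes), "nodes, abs error:", abs(approx - exact))
```

With two nodes this reduces to the trapezoidal rule and with three equally spaced nodes to Simpson's rule; using more interpolation points raises the order of the approximation for smooth integrands, which is the intuition behind spending several evaluations per step to take larger steps.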

Paper Structure

This paper contains 46 sections, 12 theorems, 180 equations, 1 algorithm.

Key Result

Theorem 1

Suppose that Assumptions assump:data_distribution, assump:score_estimation_1 and assump:score_estimation_2 hold. Let $K>0$ be an arbitrary fixed integer. If $T \geq C_2d^2\log^3T$ and $N = \lceil C_3\log T\rceil$ for some large enough constants $C_2, C_3 > 0$, then Algorithm algorithm:high_order ach

Theorems & Definitions (16)

  • Remark 1: The time indices $t$ vs. $\tau$
  • Remark 2
  • Remark 3
  • Theorem 1
  • Lemma 1
  • Lemma 2: Lemma 1 in li2024sharp
  • Lemma 3
  • Lemma 4
  • Lemma 5: Lemma 3 in li2024sharp
  • Lemma 6
  • ...and 6 more