Restart Sampling for Improving Generative Processes

Yilun Xu, Mingyang Deng, Xiang Cheng, Yonglong Tian, Ziming Liu, Tommi Jaakkola

TL;DR

Restart sampling introduces a novel forward-backward cycling process that injects substantial noise into a forward pass and then follows a backward ODE, repeating this cycle to amplify contraction of accumulated errors while preserving ODE-level discretization accuracy. The method yields tighter Wasserstein-based error bounds than traditional ODE or SDE samplers and empirically outperforms both in speed and quality across CIFAR-10, ImageNet 64×64, and large-scale text-to-image tasks, including Stable Diffusion. The work provides theoretical insights into contraction effects, practical guidelines for parameter choices, and evidence of improved text-image alignment and diversity. These results suggest Restart as a versatile, efficient sampler for diffusion-model families and related differential-equation-based generative models.

Abstract

Generative processes that involve solving differential equations, such as diffusion models, frequently necessitate balancing speed and quality. ODE-based samplers are fast but plateau in performance while SDE-based samplers deliver higher sample quality at the cost of increased sampling time. We attribute this difference to sampling errors: ODE-samplers involve smaller discretization errors while stochasticity in SDE contracts accumulated errors. Based on these findings, we propose a novel sampling algorithm called Restart in order to better balance discretization errors and contraction. The sampling method alternates between adding substantial noise in additional forward steps and strictly following a backward ODE. Empirically, Restart sampler surpasses previous SDE and ODE samplers in both speed and accuracy. Restart not only outperforms the previous best SDE results, but also accelerates the sampling speed by 10-fold / 2-fold on CIFAR-10 / ImageNet $64 \times 64$. In addition, it attains significantly better sample quality than ODE samplers within comparable sampling times. Moreover, Restart better balances text-image alignment/visual quality versus diversity than previous samplers in the large-scale text-to-image Stable Diffusion model pre-trained on LAION $512 \times 512$. Code is available at https://github.com/Newbeeer/diffusion_restart_sampling
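The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it is a toy under stated assumptions: one-dimensional Gaussian data so that the score is known in closed form rather than learned, a noise schedule $\sigma(t) = t$, a forward jump from noise level `r_min` to `r_max` implemented by adding Gaussian noise of variance `r_max**2 - r_min**2`, and plain Euler steps for the backward ODE. All function and parameter names (`score`, `ode_backward`, `restart_sample`, `r_min`, `r_max`, `k_restarts`) are hypothetical.

```python
import numpy as np

# Toy data distribution N(MU, S^2); chosen only so the score is known exactly.
MU, S = 2.0, 0.5


def score(x, t):
    """Exact score of p_t = N(MU, S^2 + t^2), the data blurred to noise level t."""
    return -(x - MU) / (S**2 + t**2)


def ode_backward(x, t_start, t_end, n_steps):
    """Euler integration of the probability-flow ODE dx/dt = -t * score(x, t)."""
    ts = np.linspace(t_start, t_end, n_steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        x = x + (t_next - t_cur) * (-t_cur * score(x, t_cur))
    return x


def restart_sample(n, rng, t_max=10.0, r_min=0.5, r_max=2.0,
                   k_restarts=3, n_steps=50):
    # Start from an imperfect prior N(0, t_max^2); its mismatch with the true
    # p_{t_max} stands in for accumulated error in this toy.
    x = rng.normal(0.0, t_max, size=n)
    x = ode_backward(x, t_max, r_min, n_steps)
    for _ in range(k_restarts):
        # Forward step: jump from r_min back up to r_max by adding noise.
        x = x + np.sqrt(r_max**2 - r_min**2) * rng.standard_normal(n)
        # Backward step: deterministic ODE from r_max down to r_min.
        x = ode_backward(x, r_max, r_min, n_steps)
    # Finish with the ODE from r_min down to noise level 0.
    return ode_backward(x, r_min, 0.0, n_steps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    plain = ode_backward(rng.normal(0.0, 10.0, size=20000), 10.0, 0.0, 100)
    restarted = restart_sample(20000, rng)
    print("ODE only: mean error", abs(plain.mean() - MU))
    print("Restart:  mean error", abs(restarted.mean() - MU))
```

In this toy setting the deliberately biased prior shifts the mean of the plain-ODE samples, and each forward-backward cycle should shrink that shift, which is the contraction effect the abstract refers to. The interval `[r_min, r_max]` and the number of cycles are illustrative values, not settings recommended by the paper.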

Paper Structure

This paper contains 37 sections, 11 theorems, 73 equations, 15 figures, 10 tables, 3 algorithms.

Key Result

Theorem 1

Let $t_{\textrm{max}}$ be the initial noise level and $p_t$ denote the true distribution at noise level $t$. Let $p^{\textrm{ODE}_\theta}_t$ and $p^{\textrm{SDE}_\theta}_t$ denote the distributions obtained by simulating $\textrm{ODE}_\theta$ and $\textrm{SDE}_\theta$, respectively. Assume that $\forall t$ [...] In the above, $U = BL_1/t_{\textrm{min}} + L_1^2 t_{\textrm{max}}^2 / t_{\textrm{min}}^2$, $\lambda <$ [...]

Figures (15)

  • Figure 1: (a) Illustration of the implementation of drift and noise terms in ODE, SDE, and Restart. (b) Sample quality versus number of function evaluations (NFE) for different approaches. ODE (Green) provides fast speeds but attains only mediocre quality, even with a large NFE. SDE (Yellow) obtains good sample quality but necessitates substantial sampling time. In contrast to ODE and SDE, which have their own winning regions, Restart (Red) achieves the best quality across all NFEs.
  • Figure 1: Unconditional CIFAR-10 with EDM and PFGM++
  • Figure 2: Additional sampling error versus (a) contracted error, where the Pareto frontier is plotted and (b) total error, where the scatter plot is provided. (c) Pareto frontier of NFE versus total error.
  • Figure 3: FID versus NFE on (a) unconditional generation on CIFAR-10 with VP; (b) class-conditional generation on ImageNet $64 \times 64$ with EDM.
  • Figure 4: CIFAR-10, VP, in the low NFE regime. Restart consistently outperforms the DPM-solver with an NFE ranging from 16 to 36.
  • ...and 10 more figures

Theorems & Definitions (21)

  • Theorem 1: Informal
  • Theorem 2: Informal
  • Theorem 3
  • proof
  • Theorem 4
  • proof
  • Lemma 1
  • proof
  • Lemma 2: Discretization bound for ODE
  • proof
  • ...and 11 more