An Ordinary Differential Equation Sampler with Stochastic Start for Diffusion Bridge Models

Yuang Wang, Pengfei Jin, Li Zhang, Quanzheng Li, Zhiqiang Chen, Dufan Wu

TL;DR

Diffusion bridge models initialize from corrupted images yet typically rely on SDE samplers, which can slow conditional generation. The authors propose a high-order ODE sampler with a stochastic start (ODES3) that uses posterior sampling to bypass the PF-ODE start singularity, followed by Heun's second-order integration to rapidly solve the PF-ODE, achieving high perceptual quality with fewer neural function evaluations. The method is training-free and compatible with pretrained diffusion-bridge models, with extensive experiments on image restoration and translation showing state-of-the-art FID and visual quality improvements. This work provides a practical route to faster, more accurate conditional generation in diffusion-bridge setups and motivates exploring additional high-order ODE solvers in future research.

Abstract

Diffusion bridge models have demonstrated promising performance in conditional image generation tasks, such as image restoration and translation, by initializing the generative process from corrupted images instead of pure Gaussian noise. However, existing diffusion bridge models often rely on Stochastic Differential Equation (SDE) samplers, which makes inference slower than in diffusion models that use high-order Ordinary Differential Equation (ODE) solvers for acceleration. To close this gap, we propose a high-order ODE sampler with a stochastic start for diffusion bridge models. To overcome the singular behavior of the probability flow ODE (PF-ODE) at the beginning of the reverse process, we introduce a posterior sampling approach at the first reverse step. This sampling step is designed to ensure a smooth transition from corrupted images to the generative trajectory while reducing discretization errors. Following this stochastic start, Heun's second-order solver is applied to solve the PF-ODE, achieving high perceptual quality with significantly fewer neural function evaluations (NFEs). Our method is fully compatible with pretrained diffusion bridge models and requires no additional training. Extensive experiments on image restoration and translation tasks, including super-resolution, JPEG restoration, Edges-to-Handbags, and DIODE-Outdoor, demonstrate that our sampler outperforms state-of-the-art methods in both visual quality and Fréchet Inception Distance (FID).
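The two-stage structure described in the abstract (one stochastic step from $T$ to $\tau$, then Heun's second-order solver from $\tau$ to 0) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`stochastic_start`, `heun_solve`, `sample`), the parameters `tau`, `sigma`, and `n_steps`, and the `denoiser` and `drift` callables are all hypothetical. The Brownian-bridge Gaussian used for the first step is a stand-in; the paper's actual posterior, time schedule, and PF-ODE drift depend on its bridge definition and are not reproduced here.

```python
import numpy as np


def stochastic_start(y, x0_hat, T, tau, sigma, rng):
    """Draw X_tau given X_T = y and a clean-image estimate x0_hat.

    ASSUMPTION: a Brownian-bridge Gaussian pinned at x0_hat (t=0) and y (t=T)
    is used as a stand-in for the paper's posterior.
    """
    w = tau / T
    mean = (1.0 - w) * x0_hat + w * y
    std = sigma * np.sqrt(tau * (T - tau) / T)
    return mean + std * rng.standard_normal(np.shape(y))


def heun_solve(drift, x, ts):
    """Integrate dx/dt = drift(x, t) along the time grid ts with Heun's method.

    Returns the final state and the number of drift evaluations (NFEs).
    """
    nfe = 0
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        h = t_next - t_cur                  # negative when integrating backward
        d_cur = drift(x, t_cur)             # slope at the current point
        x_euler = x + h * d_cur             # Euler predictor
        d_next = drift(x_euler, t_next)     # slope at the predicted point
        x = x + 0.5 * h * (d_cur + d_next)  # trapezoidal corrector
        nfe += 2
    return x, nfe


def sample(y, denoiser, drift, T=1.0, tau=0.9, sigma=1.0, n_steps=8, seed=0):
    """Hypothetical sampler: one stochastic step T -> tau, then Heun tau -> 0."""
    rng = np.random.default_rng(seed)
    x0_hat = denoiser(y, T)                 # one network call at t = T
    x = stochastic_start(y, x0_hat, T, tau, sigma, rng)
    ts = np.linspace(tau, 0.0, n_steps + 1)
    x, nfe = heun_solve(drift, x, ts)
    return x, nfe + 1                       # +1 for the denoiser call above
```

In this sketch each Heun step costs two drift evaluations, so the total is `2 * n_steps + 1` calls. The NFE accounting in the paper may differ, for example if the final step is handled separately.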

Paper Structure (15 sections, 3 theorems, 29 equations, 5 figures, 3 tables)

Key Result

Theorem 1

At $t=T$, the non-linear drift term in the reverse SDE (eq:reverse_sde) is well defined. Specifically, where the expected mean $\hat{X}_0^{\left(T\right)}$ is defined as:

Figures (5)

  • Figure 1: Overview of the proposed ODE sampler with a stochastic start for diffusion bridge models. The forward SDE maps the conditional data distribution $q_{\text{data}}\left(X_0|y\right)$ to the Dirac distribution centered at the corrupted image $y$. In the reverse process, $X_T$ is initialized as $y$, posterior sampling is used to transition from time $T$ to $\tau$, and Heun's second-order solver is applied to solve the PF-ODE from time $\tau$ to 0.
  • Figure 2: Visual results of the tested methods on the sr4x-bicubic task. The regions in the blue boxes are enlarged for clarity. I$^2$SB uses 100 NFEs and our method uses 38.
  • Figure 3: Visual results of the tested methods on the JPEG-10 task. The regions in the blue and yellow boxes are enlarged for clarity. I$^2$SB uses 100 NFEs and our method uses 38.
  • Figure 4: Visual results on the Edges$\rightarrow$Handbags (64$\times$64) task.
  • Figure 5: Visual results on the DIODE-Outdoor (256$\times$256) task. The regions in the blue and yellow boxes are enlarged for clarity. DDBM uses 118 NFEs and our method uses 28.

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • Theorem 3