Consistency Diffusion Bridge Models

Guande He, Kaiwen Zheng, Jianfei Chen, Fan Bao, Jun Zhu

TL;DR

This work learns the consistency function of the probability-flow ordinary differential equation (PF-ODE) of DDBMs, which directly predicts the solution at a starting step given any point on the ODE trajectory. The approach applies flexibly to DDBMs with a broad range of design choices.

Abstract

Diffusion models (DMs) have become the dominant paradigm of generative modeling in a variety of domains by learning stochastic processes from noise to data. Recently, denoising diffusion bridge models (DDBMs), a new formulation of generative modeling that builds stochastic processes between fixed data endpoints based on a reference diffusion process, have achieved empirical success across tasks with coupled data distributions, such as image-to-image translation. However, the sampling process of DDBMs typically requires hundreds of network evaluations to achieve decent performance, which may impede their practical deployment due to high computational demands. In this work, inspired by recent advances in consistency models for DMs, we tackle this problem by learning the consistency function of the probability-flow ordinary differential equation (PF-ODE) of DDBMs, which directly predicts the solution at a starting step given any point on the ODE trajectory. Based on a dedicated general-form ODE solver, we propose two paradigms, consistency bridge distillation and consistency bridge training, both of which apply flexibly to DDBMs with a broad range of design choices. Experimental results show that our proposed method can sample $4\times$ to $50\times$ faster than the base DDBM and produce better visual quality with the same number of steps in various tasks with pixel resolution ranging from $64 \times 64$ to $256 \times 256$, and that it supports downstream tasks such as semantic interpolation in the data space.
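
To make the notion of a consistency function concrete, here is a minimal sketch on a toy linear ODE with a known closed-form solution. It is not the bridge PF-ODE of DDBMs and involves no neural network; the ODE $dx/dt = A x$, the constants `EPS` and `A`, and the function names are all assumptions chosen for illustration. It only shows the defining property: every point on one trajectory maps to the same value at the starting time.

```python
import math

EPS = 1e-3   # hypothetical starting time of the trajectory (assumption)
A = -0.5     # hypothetical coefficient of the toy ODE dx/dt = A * x (assumption)


def ode_solution(x_eps: float, t: float) -> float:
    """Exact solution of dx/dt = A * x at time t, started from x_eps at time EPS."""
    return x_eps * math.exp(A * (t - EPS))


def consistency_fn(x_t: float, t: float) -> float:
    """Map any point (x_t, t) on a trajectory back to that trajectory's value at EPS."""
    return x_t * math.exp(-A * (t - EPS))


# Sample several points along a single trajectory and map each one back.
x_start = 2.0
times = [EPS, 0.1, 0.5, 1.0]
outputs = [consistency_fn(ode_solution(x_start, t), t) for t in times]

# Every output agrees with x_start up to floating-point error.
print(outputs)
```

In this toy case the consistency function is available analytically. In the setting described in the abstract it is instead learned, since the PF-ODE of a DDBM has no such closed form in general.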

Paper Structure

This paper contains 38 sections, 6 theorems, 66 equations, 10 figures, 4 tables.

Key Result

Proposition 2.1

Given an initial value $\bm{x}_t$ at time $t>0$, the first-order solver of the bridge ODE in Eqn. (eq:PF-ODE-DDBM) from $t$ to $r\in [0,t]$ with the noise schedule defined in Eqn. (eq:noise-schedule-bridge) is:

Figures (10)

  • Figure 1: NFE-FID plot of CDBM and DDBM on ImageNet $256 \times 256$
  • Figure 2: Ablation for hyperparameters of CDBM
  • Figure 3: Qualitative demonstration between DDBM and CDBM.
  • Figure 4: Example semantic interpolation result with CDBMs
  • Figure 5: Illustration of the effect of the parameter $b$ on the sigmoid-style training schedule.
  • ...and 5 more figures

Theorems & Definitions (11)

  • Example 2.1
  • Proposition 2.1
  • Proposition 2.2
  • Proposition 2.3
  • Proposition A.1
  • Example A.1
  • proof
  • Proposition A.1
  • proof
  • Proposition A.1
  • ...and 1 more