Latent Schrodinger Bridge: Prompting Latent Diffusion for Fast Unpaired Image-to-Image Translation

Jeongsol Kim, Beomsu Kim, Jong Chul Ye

TL;DR

This paper proposes Latent Schrodinger Bridges (LSBs) that approximate the SB ODE via pre-trained Stable Diffusion, and develops appropriate prompt optimization and change of variables formula to match the training and inference between distributions.

Abstract

Diffusion models (DMs), which enable both image generation from noise and inversion from data, have inspired powerful unpaired image-to-image (I2I) translation algorithms. However, they often require a large number of neural function evaluations (NFEs), limiting their practical applicability. In this paper, we tackle this problem with Schrodinger Bridges (SBs), which are stochastic differential equations (SDEs) between distributions with minimal transport cost. We analyze the probability flow ordinary differential equation (ODE) formulation of SBs, and observe that we can decompose its vector field into a linear combination of source predictor, target predictor, and noise predictor. Inspired by this observation, we propose Latent Schrodinger Bridges (LSBs) that approximate the SB ODE via pre-trained Stable Diffusion, and develop appropriate prompt optimization and change of variables formula to match the training and inference between distributions. We demonstrate that our algorithm successfully conducts competitive I2I translation in an unsupervised setting with only a fraction of the computational cost required by previous DM-based I2I methods.

Paper Structure

This paper contains 29 sections, 2 theorems, 38 equations, 15 figures, 3 tables, 1 algorithm.

Key Result

Proposition 1

The ODE with velocity defined with predictors in Eq. (eq:gamma_preds) translates samples from $\mathbb{P}_0$ to $\mathbb{P}_1$ when solved from $t = 0$ to $1$, and vice versa when solved from $t = 1$ to $0$.
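To make the "solved from $t = 0$ to $1$, and vice versa" statement concrete, here is a minimal sketch of fixed-step Euler integration run in both time directions. It is not the paper's method: the velocity below is a hypothetical constant shift between two 1D Gaussians with equal variance, standing in for the predictor-based velocity, whose coefficients are not given in this summary. The names `euler_solve`, `toy_velocity`, `M0`, and `M1` are assumptions introduced for illustration. Each Euler step costs one velocity evaluation, which is what an NFE counts when the velocity is a neural network.

```python
from typing import Callable, List

Velocity = Callable[[float, float], float]


def euler_solve(x: float, velocity: Velocity,
                t_start: float, t_end: float, n_steps: int) -> float:
    """Integrate dx/dt = velocity(x, t) from t_start to t_end with Euler steps.

    A negative step size (t_end < t_start) integrates backward in time.
    """
    dt = (t_end - t_start) / n_steps
    t = t_start
    for _ in range(n_steps):
        x = x + dt * velocity(x, t)  # one velocity evaluation per step
        t = t + dt
    return x


# Hypothetical toy setup: P0 = N(M0, 1) and P1 = N(M1, 1).
# A constant velocity M1 - M0 shifts samples of P0 onto samples of P1.
M0, M1 = -2.0, 3.0


def toy_velocity(x: float, t: float) -> float:
    return M1 - M0


sources: List[float] = [-2.5, -2.0, -1.0]

# Forward solve (t = 0 -> 1) translates source samples to the target side.
translated = [euler_solve(x, toy_velocity, 0.0, 1.0, 8) for x in sources]

# Backward solve (t = 1 -> 0) maps the translated samples back.
recovered = [euler_solve(x, toy_velocity, 1.0, 0.0, 8) for x in translated]
```

In this toy case the velocity is constant, so the round trip is exact for any step count; with a learned, state-dependent velocity, the step count trades translation quality against cost.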

Figures (15)

  • Figure 1: Decomposition of different ODEs for image to image translation. Dual Bridge requires inversion to Gaussian noise, causing large errors in fast translation; SDEdit lacks a repelling force from the source domain, resulting in incomplete translation. In contrast, the LSB ODE avoids inversion and includes a repelling term from the source image, enabling effective translation to the target image.
  • Figure 2: FID vs. NFE for three image translation tasks. We evaluate FID on translated images from baseline methods with various NFEs. For small NFE $\leq 10$, LSB ODE outperforms baselines and the quality is improved or maintained with more NFEs.
  • Figure 3: Qualitative comparison for Cat2Dog (8 NFEs)
  • Figure 4: Qualitative comparison for Horse2Zebra (8 NFEs)
  • Figure 5: Qualitative comparison for Dog2Wild (8 NFEs)
  • ...and 10 more figures

Theorems & Definitions (4)

  • Proposition 1
  • Proposition 2
  • proof
  • proof