Regularized Distribution Matching Distillation for One-step Unpaired Image-to-Image Translation

Denis Rakitin, Ivan Shchekotov, Dmitry Vetrov

TL;DR

We introduce Regularized Distribution Matching Distillation (RDMD), a modification of Distribution Matching Distillation applicable to unpaired image-to-image (I2I) problems, and demonstrate its empirical performance on several translation tasks, including 2D examples and I2I between different image datasets.

Abstract

Diffusion distillation methods aim to compress diffusion models into efficient one-step generators while preserving quality. Among them, Distribution Matching Distillation (DMD) offers a suitable framework for training general-form one-step generators, applicable beyond unconditional generation. In this work, we introduce its modification, called Regularized Distribution Matching Distillation, applicable to unpaired image-to-image (I2I) problems. We demonstrate its empirical performance on several translation tasks, including 2D examples and I2I between different image datasets, where it performs on par with or better than multi-step diffusion baselines.

Paper Structure (39 sections, 16 theorems, 74 equations, 6 figures)


Key Result

Theorem 3.1

Let $c(\boldsymbol{x}, \boldsymbol{y})$ be the quadratic cost $\|\boldsymbol{x} - \boldsymbol{y}\|^2$ and $G^\lambda$ be the theoretical optimum in the problem eq:rdmd_loss. Then, under mild regularity conditions, it converges in probability (with respect to $p^{\mathcal{S}}$) to the optimal transpo

Figures (6)

  • Figure 1: Illustration of the performance of the proposed RDMD model on the $\text{cat} \rightarrow \text{wild}$ translation problem from the AFHQv2 dataset (Choi et al., 2020).
  • Figure 2: Comparison of the DMD loss surfaces without (left) and with (right) transport cost regularization on a toy problem of translating $\mathcal{N}(0, I)$ to $\mathcal{N}(0, 1.5^2 I)$. We set the regularization coefficient $\lambda = 0.2$. The generator is parameterized as $r \cdot C(\alpha)$, where $C(\alpha)$ is the rotation matrix corresponding to the angle $\alpha$. The set of minima on the left contains all orthogonal matrices multiplied by $\sigma = 1.5$, while the minimum on the right is attained at a single point, which is close, but not equal, to the OT map. The surfaces are shifted upward for the sake of visualization.
  • Figure 3: Visualization of RDMD mappings on $\text{Gaussian} \rightarrow \text{8 Gaussians}$ with different choices of the regularization coefficient $\lambda$.
  • Figure 4: Comparison of RDMD with diffusion-based baselines. The figure demonstrates the tradeoff between generation quality (FID$\downarrow$) and the difference between the input and output (L2$\downarrow$, PSNR$\uparrow$, SSIM$\uparrow$). RDMD gives an overall better tradeoff given fairly strict requirements on the transport cost. For PSNR and SSIM, the $y$-axis is inverted so that every plot reads the same way as the first one (left is better, low is better).
  • Figure 5: Visual comparison of RDMD with diffusion-based baselines.
  • ...and 1 more figure
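
The toy setting in the Figure 2 caption can be reproduced approximately in closed form. The sketch below is an illustration under stated assumptions, not the paper's implementation: it replaces the DMD loss with a plain KL divergence between the generator output distribution $\mathcal{N}(0, r^2 I)$ and the target $\mathcal{N}(0, \sigma^2 I)$ in 2D, and adds $\lambda$ times the quadratic transport cost $\mathbb{E}\|\boldsymbol{x} - r C(\alpha) \boldsymbol{x}\|^2$. The function names (`kl_term`, `transport_cost`, `surrogate_loss`, `argmin_on_grid`) and the grid ranges are hypothetical choices made for this example.

```python
import math

SIGMA = 1.5  # target standard deviation (from the Figure 2 caption)
LAM = 0.2    # regularization coefficient (from the Figure 2 caption)


def kl_term(r, sigma=SIGMA):
    """KL(N(0, r^2 I) || N(0, sigma^2 I)) in 2D.

    Assumption: stands in for the DMD loss; it does not depend on the angle.
    """
    ratio = (r / sigma) ** 2
    return ratio - 1.0 - math.log(ratio)


def transport_cost(r, alpha):
    """E||x - r C(alpha) x||^2 for x ~ N(0, I) in 2D.

    Equals trace((I - rC)^T (I - rC)) = 2 + 2 r^2 - 4 r cos(alpha).
    """
    return 2.0 + 2.0 * r * r - 4.0 * r * math.cos(alpha)


def surrogate_loss(r, alpha, lam):
    """Surrogate objective: KL term plus lam times the transport cost."""
    return kl_term(r) + lam * transport_cost(r, alpha)


def argmin_on_grid(lam, n_r=401, n_alpha=181):
    """Brute-force minimizer over r in [0.5, 2.5] and alpha in [-pi, pi]."""
    best_loss, best_r, best_alpha = float("inf"), None, None
    for i in range(n_r):
        r = 0.5 + 2.0 * i / (n_r - 1)
        for j in range(n_alpha):
            alpha = -math.pi + 2.0 * math.pi * j / (n_alpha - 1)
            loss = surrogate_loss(r, alpha, lam)
            if loss < best_loss:
                best_loss, best_r, best_alpha = loss, r, alpha
    return best_r, best_alpha


if __name__ == "__main__":
    # Without regularization, every angle is optimal at r = sigma.
    print([round(surrogate_loss(SIGMA, a, 0.0), 12) for a in (0.0, 1.0, 2.0)])
    # With regularization, the minimizer is unique.
    print(argmin_on_grid(LAM))
```

Under these assumptions, the unregularized surrogate is flat in $\alpha$ along $r = \sigma$, while the regularized one has a single minimizer at $\alpha = 0$ with $r$ somewhat below $1.5$ (roughly $1.35$ for $\lambda = 0.2$), which matches the caption's remark that the regularized minimum is close, but not equal, to the OT map $\boldsymbol{x} \mapsto 1.5\,\boldsymbol{x}$.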

Theorems & Definitions (30)

  • Theorem 3.1
  • Theorem 1.1
  • Definition 1.2
  • Definition 1.3
  • Theorem 1.4: Portmanteau/Alexandrov
  • Definition 1.5
  • Definition 1.6
  • Theorem 1.7: Prokhorov
  • Corollary 1.8
  • Corollary 1.9
  • ...and 20 more