SPI-GAN: Denoising Diffusion GANs with Straight-Path Interpolations

Jinsung Jeon, Noseong Park

TL;DR

This work presents an enhanced GAN-based denoising method, called SPI-GAN, using the proposed straight-path interpolation definition, and proposes a GAN architecture characterized by a continuous mapping neural network for imitating the denoising path.

Abstract

Score-based generative models (SGMs) achieve state-of-the-art sampling quality and diversity. However, their training and sampling complexity is notoriously high because of their highly complicated forward and reverse processes, which makes them unsuitable for resource-limited settings. To solve this problem, learning a simpler process has recently attracted much attention. We present an enhanced GAN-based denoising method, called SPI-GAN, using our proposed straight-path interpolation definition. To this end, we propose a GAN architecture that i) denoises through the straight path and ii) is characterized by a continuous mapping neural network for imitating the denoising path. This approach drastically reduces the sampling time while achieving sampling quality and diversity as high as those of SGMs. As a result, SPI-GAN is one of the best-balanced models in terms of sampling quality, diversity, and time on CIFAR-10 and CelebA-HQ-256.

Paper Structure (16 sections, 3 equations, 5 figures, 3 tables)


Figures (5)

  • Figure 1: The comparison among four models: i) the original formulation of SGMs in (a), ii) DD-GAN's learning the shortcuts of the reverse SDE in (b), iii) Diffusion-GAN's augmentation method in (c) and iv) SPI-GAN's learning the straight path in (d). The red paths in (b), (c), and (d) are used as the discriminators' input.
  • Figure 2: The architecture of our proposed SPI-GAN. $\mathbf{h}(u)$ is a latent vector that generates an interpolated image $\mathbf{i}(u)$ at time $u$, where $\mathbf{i}(1)$ is an original image and $\mathbf{i}(0)$ is a noisy image. We perform adversarial training at every time $u$ but generate images with $u=1$. The constant noise $\mathbf{c}$ and the layer-wise varying noise $\mathbf{s}$ provide the generator's stochasticity.
  • Figure 3: Process of generating a sample from a continuous latent vector.
  • Figure 4: Difference between reverse SDE and interpolation.
  • Figure 5: Qualitative results on CIFAR-10 and CelebA-HQ-256.
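
The Figure 2 caption fixes the endpoints of the interpolation: $\mathbf{i}(1)$ is an original image and $\mathbf{i}(0)$ is a noisy image. A minimal sketch of what a straight path between them might look like follows, assuming the path is the plain linear blend $\mathbf{i}(u) = u\,\mathbf{i}(1) + (1-u)\,\mathbf{i}(0)$. The function names, the flat-list image representation, and the additive Gaussian noise used to build $\mathbf{i}(0)$ are illustrative assumptions, not details taken from this page.

```python
import random


def straight_path_interpolate(clean, noisy, u):
    """Linearly blend a noisy sample (u = 0) with a clean sample (u = 1).

    Assumption: the straight path is i(u) = u * i(1) + (1 - u) * i(0).
    Images are represented as flat lists of floats for simplicity.
    """
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    if len(clean) != len(noisy):
        raise ValueError("clean and noisy must have the same length")
    return [u * c + (1.0 - u) * n for c, n in zip(clean, noisy)]


def make_noisy(clean, sigma=1.0, seed=0):
    """Hypothetical stand-in for i(0): the clean sample plus Gaussian noise."""
    rng = random.Random(seed)
    return [c + rng.gauss(0.0, sigma) for c in clean]


if __name__ == "__main__":
    clean = [0.0, 0.5, 1.0]      # toy "image" i(1)
    noisy = make_noisy(clean)    # toy noisy endpoint i(0)
    for u in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(u, straight_path_interpolate(clean, noisy, u))
```

At $u=0$ the function returns the noisy endpoint and at $u=1$ the clean one, matching the convention in the caption; intermediate values of $u$ lie on the line segment between the two.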