
A High-Quality Robust Diffusion Framework for Corrupted Dataset

Quan Dao, Binh Ta, Tung Pham, Anh Tran

TL;DR

The paper addresses robustness to corrupted training data in image synthesis. It introduces Robust Diffusion Unbalanced OT (RDUOT), a diffusion-based framework that replaces the GAN in DDGAN with a UOT-based generative model and learns the conditional $q(x_0|x_t)$ to filter out outliers. The method builds on semi-dual Unbalanced Optimal Transport with a Lipschitz-friendly choice of $\Psi$, and trains a generator $G_\theta$ against a potential $D_\phi$ to stabilize training while using latent variables for diversity. A theoretical result shows that the forward diffusion process reduces the Wasserstein distance between the clean and outlier distributions, which supports the approach. Experiments show strong robustness across corrupted datasets and superior performance on clean data, with ablations on $\Psi$, the number of timesteps, and $\tau$. In practice, RDUOT delivers fast, high-fidelity, robust image generation, outperforming DDGAN and other robustness methods on the reported benchmarks, and appears applicable to real-world corrupted data.

Abstract

Developing image-generative models that are robust to outliers in the training data has recently drawn attention from the research community. Because unbalanced optimal transport (UOT) integrates easily into adversarial frameworks, existing works focus mainly on robust frameworks for generative adversarial networks (GANs). Meanwhile, diffusion models have recently outperformed GANs on various tasks and datasets. However, to our knowledge, none of them is robust to corrupted datasets. Motivated by DDGAN, our work introduces the first robust-to-outlier diffusion model. We propose replacing the GAN in DDGAN with a UOT-based generative model to learn the backward diffusion process. Additionally, we demonstrate that the Lipschitz property of the divergence in our framework contributes to more stable training convergence. Remarkably, our method not only exhibits robustness to corrupted datasets but also achieves superior performance on clean datasets.
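
As a rough illustration of the UOT-based objective mentioned above, the following NumPy sketch writes down the generic semi-dual UOT losses with Softplus as the convex conjugate $\Psi^*$. It is not taken from the paper: the function names, the plain arrays standing in for the cost and for the potential's outputs, and the omission of any conditioning on $x_t$ and the timestep are all assumptions made for illustration. It also compares the slope of Softplus (bounded by 1) with the slope of $e^y - 1$, the conjugate of the KL function, which is unbounded.

```python
import numpy as np

def softplus(x):
    """Numerically stable log(1 + exp(x)); used here as the conjugate Psi*."""
    return np.logaddexp(0.0, x)

def potential_loss(cost, v_fake, v_real):
    """Generic semi-dual UOT loss for the potential (hypothetical helper).

    cost   : c(x, G(x)) for each generated sample
    v_fake : potential evaluated at generated samples
    v_real : potential evaluated at real samples
    """
    return np.mean(softplus(-(cost - v_fake))) + np.mean(softplus(-v_real))

def generator_loss(cost, v_fake):
    """Generic semi-dual UOT loss for the generator (hypothetical helper)."""
    return np.mean(cost - v_fake)

# Made-up numbers, only to exercise the two losses.
cost = np.array([0.5, 1.0])
v_fake = np.array([0.5, 1.0])
v_real = np.array([0.0, 0.0])
loss_d = potential_loss(cost, v_fake, v_real)
loss_g = generator_loss(cost, v_fake)

# Slopes of the two conjugates on a grid: sigmoid for Softplus, exp for KL.
y = np.linspace(-20.0, 20.0, 401)
softplus_slope = 1.0 / (1.0 + np.exp(-y))
kl_conjugate_slope = np.exp(y)
max_softplus_slope = softplus_slope.max()
max_kl_slope = kl_conjugate_slope.max()
```

On this grid the Softplus slope never exceeds 1, while the slope of the KL conjugate grows exponentially, which is one way to read the stability claim in the abstract.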

Paper Structure

This paper contains 27 sections, 3 theorems, 44 equations, 10 figures, 14 tables, 1 algorithm.

Key Result

Proposition

Let $P^c$ and $P^o$ denote the clean and outlier probability measures, respectively. For a probability measure $P$, let $P_t$ be the distribution of $x_t$ obtained from $x_0 \sim P$ by the forward diffusion. Then the Wasserstein distance $W(P^c_t, P^o_t)$ decreases as $t$ increases.
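
The proposition can be checked by hand in a simple special case. The sketch below assumes one-dimensional Gaussian clean and outlier measures, a variance-preserving forward process $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$, and the 2-Wasserstein distance, for which a closed form exists between Gaussians. None of these choices is stated in the text above; the helper names, the grid of $\bar\alpha_t$ values, and the reading of the second Gaussian parameter as a variance are illustrative assumptions.

```python
import numpy as np

def w2_gaussian_1d(m1, v1, m2, v2):
    """Closed-form 2-Wasserstein distance between N(m1, v1) and N(m2, v2) in 1D."""
    return np.sqrt((m1 - m2) ** 2 + (np.sqrt(v1) - np.sqrt(v2)) ** 2)

def diffuse_gaussian(mean, var, alpha_bar):
    """Mean and variance of x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    return np.sqrt(alpha_bar) * mean, alpha_bar * var + (1.0 - alpha_bar)

# Assumed clean and outlier Gaussians, given as (mean, variance).
clean = (1.0, 0.1)
outlier = (-1.0, 0.05)

# alpha_bar goes from 1 (t = 0, no noise) down to 0 (pure noise).
alpha_bars = np.linspace(1.0, 0.0, 11)
distances = np.array([
    w2_gaussian_1d(*diffuse_gaussian(*clean, a), *diffuse_gaussian(*outlier, a))
    for a in alpha_bars
])

for a, d in zip(alpha_bars, distances):
    print(f"alpha_bar = {a:.1f}  W2 = {d:.4f}")
```

In this Gaussian case both the mean gap and the standard-deviation gap shrink as noise is added, so the distance falls monotonically and reaches zero once both measures become standard normal.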

Figures (10)

  • Figure 1: From left to right: the CE+FT, CE+CH, CE+MT, and CE+FCE datasets. Top: DDGAN; bottom: RDUOT. The red boxes mark synthesized outliers among the clean synthesized samples.
  • Figure 2: Qualitative results of RDUOT on three datasets: STL-10, CIFAR-10, and CelebA-HQ.
  • Figure 3: Graphs of the KL function and of the function whose convex conjugate is Softplus.
  • Figure 4: Outlier robustness on a toy dataset with $5\%$ outliers. The toy dataset is a mixture of two Gaussians, $\mathcal{N}(1, 0.1)$ (clean) and $\mathcal{N}(-1, 0.05)$ (outlier), with mixture weights $(0.95, 0.05)$. First row: target and generated densities for DDGAN (left) and RDUOT (right). Second row: RDUOT with the semi-dual UOT loss applied to only some timesteps; from left to right, the loss is applied to the first 1, 2, and 3 timesteps, and then to all timesteps.
  • Figure 5: Isolation forest + DDGAN
  • ...and 5 more figures
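
For readers who want to reproduce the toy setting described in Figure 4, the mixture can be sampled with a few lines of NumPy. This is a minimal sketch, not the authors' data pipeline: the function name, the seed, the sample count, and the reading of the second Gaussian parameter as a variance are assumptions.

```python
import numpy as np

def sample_toy_mixture(n, outlier_rate=0.05, seed=0):
    """Draw n samples from 0.95 * N(1, 0.1) + 0.05 * N(-1, 0.05).

    The second parameter of each Gaussian is treated as a variance (assumption).
    Returns the samples and a boolean mask marking which ones are outliers.
    """
    rng = np.random.default_rng(seed)
    is_outlier = rng.random(n) < outlier_rate
    clean = rng.normal(loc=1.0, scale=np.sqrt(0.1), size=n)
    outlier = rng.normal(loc=-1.0, scale=np.sqrt(0.05), size=n)
    return np.where(is_outlier, outlier, clean), is_outlier

samples, is_outlier = sample_toy_mixture(100_000)
print(f"outlier fraction: {is_outlier.mean():.4f}")
```

Keeping the outlier mask alongside the samples makes it easy to measure how much generated mass ends up near the outlier mode.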

Theorems & Definitions (5)

  • Proposition
  • Proposition (restated)
  • Lemma
  • Proof
  • Proof of the restated proposition