A High-Quality Robust Diffusion Framework for Corrupted Dataset
Quan Dao, Binh Ta, Tung Pham, Anh Tran
TL;DR
The paper addresses robustness to corrupted training data in image synthesis by introducing Robust Diffusion Unbalanced OT (RDUOT), a diffusion-based framework that replaces the GAN in DDGAN with a UOT-based generative model and learns the conditional $q(x_0|x_t)$ to filter out outliers. The method builds on semi-dual Unbalanced Optimal Transport with a Lipschitz divergence function $\Psi$, and designs a generator $G_\theta$ and a potential $D_\phi$ that stabilize training while using latent variables for sample diversity. Theoretical results show that the diffusion process reduces the Wasserstein distance between the clean and outlier distributions, which motivates the approach. Experiments demonstrate robustness across corrupted datasets and superior performance on clean data, with ablations on $\Psi$, the number of timesteps, and $\tau$. In practice, RDUOT delivers fast, high-fidelity, and robust image generation, outperforms DDGAN and other robust generative methods on the reported benchmarks, and shows promise for real-world corrupted-data scenarios.
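The claim that diffusion brings the clean and outlier distributions closer can be illustrated with a toy calculation. The sketch below is not the paper's theorem or code: it assumes both distributions are one-dimensional Gaussians (the means, standard deviations, and `alpha_bar` values are hypothetical) and uses the standard forward-diffusion marginal $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$ together with the closed-form 2-Wasserstein distance between 1D Gaussians.

```python
import math

def diffuse_gaussian(mean, std, alpha_bar):
    """Mean and std of x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps,
    where x_0 ~ N(mean, std^2) and eps ~ N(0, 1) are independent."""
    new_mean = math.sqrt(alpha_bar) * mean
    new_std = math.sqrt(alpha_bar * std ** 2 + (1.0 - alpha_bar))
    return new_mean, new_std

def w2_gaussian(m1, s1, m2, s2):
    """Closed-form 2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2)."""
    return math.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)

# Hypothetical toy distributions (assumptions, not taken from the paper).
clean = (0.0, 1.0)      # (mean, std) of the clean data
outlier = (5.0, 0.5)    # (mean, std) of the outlier data

# alpha_bar = 1 means no noise; alpha_bar = 0 means pure noise.
alpha_bars = [1.0, 0.75, 0.5, 0.25, 0.0]

distances = []
for a in alpha_bars:
    c_mean, c_std = diffuse_gaussian(*clean, a)
    o_mean, o_std = diffuse_gaussian(*outlier, a)
    distances.append(w2_gaussian(c_mean, c_std, o_mean, o_std))

for a, d in zip(alpha_bars, distances):
    print(f"alpha_bar={a:.2f}  W2={d:.4f}")
```

In this toy setting the distance shrinks as more noise is added and vanishes once both distributions collapse to the same standard Gaussian, which matches the intuition behind separating clean data from outliers on diffused samples rather than on the raw data.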
Abstract
Developing image-generative models that are robust to outliers in the training data has recently drawn attention from the research community. Because unbalanced optimal transport (UOT) is easy to integrate into adversarial frameworks, existing works focus mainly on developing robust frameworks for generative adversarial networks (GANs). Meanwhile, diffusion models have recently outperformed GANs on various tasks and datasets. However, to our knowledge, no existing diffusion model is robust to corrupted datasets. Motivated by DDGAN, our work introduces the first robust-to-outlier diffusion model. We propose replacing the GAN in DDGAN with a UOT-based generative model to learn the backward diffusion process. Additionally, we demonstrate that the Lipschitz property of the divergence in our framework contributes to more stable training convergence. Remarkably, our method not only exhibits robustness to corrupted datasets but also achieves superior performance on clean datasets.
