Label-Noise Robust Diffusion Models

Byeonghu Na, Yeongmin Kim, HeeSun Bae, Jung Hyun Lee, Se Jung Kwon, Wanmo Kang, Il-Chul Moon

TL;DR

This work addresses conditional diffusion models trained on noisy labels by deriving a linear relationship between clean-label and noisy-label conditional scores and introducing Transition-aware weighted Denoising Score Matching (TDSM). The core idea is to represent the noisy-label score as a time- and instance-dependent convex combination of clean-label score networks, using weights $w(\boldsymbol{x}_t, \tilde{y}, y, t)$ derived from the transition matrix $S$ and a time-dependent noisy-label classifier. The authors prove that, when $S$ is invertible, TDSM recovers the true clean-label conditional scores, and they provide practical estimation methods for the weights, including VolMinNet when $S$ is unknown. Empirical results on MNIST, CIFAR-10/100, and Clothing-1M show that TDSM improves conditional and unconditional generation metrics under various noise regimes and remains beneficial when combined with existing noisy-label correctors. The approach offers a scalable, diffusion-specific remedy to label noise with strong empirical robustness and practical training considerations.
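As a rough illustration of how such weights could be computed from a transition matrix and a noisy-label classifier, the sketch below applies Bayes' rule to the definition of $w$ stated in Theorem 1, giving $w = p(\tilde{Y}=\tilde{y}|Y=y)\, p(Y=y|\boldsymbol{x}_t) / p(\tilde{Y}=\tilde{y}|\boldsymbol{x}_t)$. It is not taken from the authors' code. The function name `transition_aware_weights`, the indexing convention `S[i, j] = p(noisy = j | clean = i)`, and the toy numbers are all assumptions made for this example.

```python
import numpy as np

def transition_aware_weights(noisy_probs, S, noisy_label):
    """Hypothetical helper: weights w(x_t, noisy_label, y, t) for every clean class y.

    noisy_probs : shape (c,), a classifier's estimate of p(noisy = j | x_t) at time t
    S           : shape (c, c), assumed convention S[i, j] = p(noisy = j | clean = i)
    noisy_label : int, the observed noisy label
    """
    # Under class-conditional noise, noisy_probs = S^T @ clean_probs,
    # so an invertible S lets us solve for the clean-label posterior.
    clean_probs = np.linalg.solve(S.T, noisy_probs)
    # Bayes' rule applied to the definition of w:
    # w[y] = S[y, noisy_label] * p(clean = y | x_t) / p(noisy = noisy_label | x_t)
    return S[:, noisy_label] * clean_probs / noisy_probs[noisy_label]

# Toy check with 3 classes and 40% symmetric noise (made-up numbers).
S = np.array([[0.6, 0.2, 0.2],
              [0.2, 0.6, 0.2],
              [0.2, 0.2, 0.6]])
clean = np.array([0.7, 0.2, 0.1])   # pretend clean-label posterior at x_t
noisy = S.T @ clean                 # what an ideal noisy-label classifier would output
w = transition_aware_weights(noisy, S, noisy_label=0)
print(w, w.sum())
```

With an exact noisy-label posterior the weights are non-negative and sum to one. With an estimated classifier the solved clean posterior can leave the range [0, 1], so a practical implementation would need some safeguard, which this sketch does not include.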

Abstract

Conditional diffusion models have shown remarkable performance in various generative tasks, but training them requires large-scale datasets that often contain noise in conditional inputs, a.k.a. noisy labels. This noise leads to condition mismatch and quality degradation of generated data. This paper proposes Transition-aware weighted Denoising Score Matching (TDSM) for training conditional diffusion models with noisy labels, which is the first such study for diffusion models. The TDSM objective contains a weighted sum of score networks, incorporating instance-wise and time-dependent label transition probabilities. We introduce a transition-aware weight estimator, which leverages a time-dependent noisy-label classifier distinctively customized to the diffusion process. Through experiments across various datasets and noisy label settings, TDSM improves the quality of generated samples aligned with given conditions. Furthermore, our method improves generation performance even on prevalent benchmark datasets, which implies that these datasets may contain noisy labels that pose a risk to generative model learning. Finally, we show the improved performance of TDSM on top of conventional noisy label corrections, which empirically demonstrates its contribution as a part of label-noise robust generative models. Our code is available at: https://github.com/byeonghu-na/tdsm.

Paper Structure (60 sections, 8 theorems, 43 equations, 23 figures, 16 tables, 1 algorithm)

This paper contains 60 sections, 8 theorems, 43 equations, 23 figures, 16 tables, 1 algorithm.

Key Result

Theorem 1

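Under a class-conditional label noise setting, for all ${\mathbf{x}}_t, \tilde{y}, t$, $$\nabla_{{\mathbf{x}}_t} \log p_t({\mathbf{x}}_t|\tilde{Y}=\tilde{y}) = \sum_{y} w({\mathbf{x}}_t, \tilde{y}, y, t) \nabla_{{\mathbf{x}}_t} \log p_t({\mathbf{x}}_t|Y=y),$$ where $w({\mathbf{x}}_t, \tilde{y}, y, t) \coloneqq p(Y=y|\tilde{Y}=\tilde{y}) \frac{p_t({\mathbf{x}}_t|Y=y)}{p_t({\mathbf{x}}_t|\tilde{Y}=\tilde{y})}$.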
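The relationship in Theorem 1, that the noisy-label conditional score equals a $w$-weighted sum of clean-label conditional scores, can be checked numerically on a toy problem. The sketch below is an illustration only: it assumes a 1-D mixture of two unit-variance Gaussians as the clean-label densities at one fixed timestep, and it borrows the label transition probability $p(Y|\tilde{Y}=1) = (0.8, 0.2)$ from the Figure 2 caption. The means, the evaluation point, and all function names are made up for this example.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

mus = np.array([-2.0, 2.0])    # assumed clean-class means at a fixed timestep
sigma = 1.0                    # assumed shared standard deviation
post = np.array([0.8, 0.2])    # p(Y = y | noisy label = 1)

def noisy_density(x):
    # Class-conditional noise: p_t(x | noisy) = sum_y p(y | noisy) * p_t(x | y)
    return float(np.sum(post * gauss_pdf(x, mus, sigma)))

def weights(x):
    # w(x, noisy, y, t) = p(y | noisy) * p_t(x | y) / p_t(x | noisy)
    comps = post * gauss_pdf(x, mus, sigma)
    return comps / comps.sum()

def clean_scores(x):
    # d/dx log N(x; mu_y, sigma^2) for each clean class y
    return -(x - mus) / sigma ** 2

x, eps = 0.3, 1e-5
# Noisy-label score estimated by central finite differences of log p_t(x | noisy)
noisy_score_fd = (np.log(noisy_density(x + eps)) - np.log(noisy_density(x - eps))) / (2 * eps)
# Right-hand side of the theorem: weighted sum of clean-label scores
weighted_clean = float(weights(x) @ clean_scores(x))
print(noisy_score_fd, weighted_clean)
```

The two printed values should agree up to finite-difference error, and the weights at any point sum to one because the noisy-label density is itself the mixture of the clean-label densities.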

Figures (23)

  • Figure 1: (a) Examples of noisy labeled datasets of MNIST (top) and CIFAR-10 (bottom), and (b-c) the randomly generated images of baseline and our models, trained with the noisy labeled datasets.
  • Figure 2: Contour maps of $w({\mathbf{x}}_t, \tilde{Y}=1, Y=1, t)$ in the 2-D Gaussian mixture model at different diffusion timesteps. The label transition probability is set to $p(Y|\tilde{Y}=1) = ( 0.8 , 0.2 )$. The dots represent samples from each clean label (orange for class 1, green for class 2), and the dashed lines represent contours with annotated values.
  • Figure 3: The training procedure of the proposed approach. The solid black arrows indicate the forward propagation, and the dashed red arrows represent the gradient signal flow. The filled circle operation denotes the dot product operation, and the dashed operation represents the L2 loss. The noisy-label classifier $\tilde{\mathbf{h}}_{\boldsymbol{\phi}^*}$ can be obtained by the cross-entropy loss on the noisy labeled dataset $\tilde{D}$.
  • Figure 4: Generated images from (a) baseline and (b) our models, trained on the CIFAR-10 dataset under 40% symmetric noise. Each row contains the samples generated by each class for a fixed ${\mathbf{x}}_T$.
  • Figure 5: Noisy labels of MNIST, captured by $\hat{w}$. Marks below images denote 'label → prediction'.
  • ...and 18 more figures
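
The training step described in the Figure 3 caption, a dot product between the weights and the score-network outputs followed by an L2 loss, could look roughly like the NumPy sketch below. This is a simplified assumption-laden illustration rather than the authors' implementation: the function name `tdsm_loss`, the array shapes, and the choice to treat the weights as fixed inputs are all hypothetical, and any time-dependent loss weighting is omitted.

```python
import numpy as np

def tdsm_loss(class_scores, weights, target_score):
    """Hypothetical sketch of a transition-aware weighted score matching loss.

    class_scores : (batch, c, d) score-network outputs for every candidate clean class
    weights      : (batch, c) weights for the observed noisy label, assumed precomputed
    target_score : (batch, d) denoising score matching regression target
    """
    # Dot product over the class axis: weighted sum of per-class scores
    mixed = np.einsum("bc,bcd->bd", weights, class_scores)
    # Squared L2 distance to the target, averaged over the batch
    return float(np.mean(np.sum((mixed - target_score) ** 2, axis=1)))

# Tiny example: 1 sample, 2 classes, 2-D data (made-up numbers).
scores = np.array([[[1.0, 0.0],
                    [0.0, 1.0]]])
loss_mixed = tdsm_loss(scores, np.array([[0.5, 0.5]]), np.array([[0.5, 0.5]]))
loss_onehot = tdsm_loss(scores, np.array([[1.0, 0.0]]), np.array([[0.0, 0.0]]))
print(loss_mixed, loss_onehot)
```

When the weights are one-hot on the observed label, this reduces to an ordinary conditional score matching loss, which is one way to see the weighted form as a generalization of training that trusts the given labels.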

Theorems & Definitions (14)

  • Remark
  • Theorem 1
  • Proposition 1
  • Theorem 2
  • Proposition 2
  • Remark
  • Theorem 2
  • proof
  • Proposition 2
  • proof
  • ...and 4 more