Training Class-Imbalanced Diffusion Model Via Overlap Optimization

Divin Yan; Lu Qi; Vincent Tao Hu; Ming-Hsuan Yang; Meng Tang

TL;DR

This work tackles the fidelity bias toward head classes in class-conditioned diffusion models trained on long-tailed data. It introduces DiffROP, a Probabilistic Contrastive Learning framework that penalizes distribution overlap between classes by increasing the divergence $D_{KL}\left[p_{\theta}(x_{t-1}|x_t,\mathbf{c}^i)\|p_{\theta}(x_{t-1}|x_t,\mathbf{c}^j)\right]$ between the reverse distributions of different classes $\mathbf{c}^i$ and $\mathbf{c}^j$, and integrates this term with the standard diffusion objective along with time-dependent weighting and hinge-like margins. The approach is modular and compatible with classifier-free guidance. It improves FID, Recall, and Inception Score on CIFAR10LT and CIFAR100LT, in particular by reducing the overlap of tail classes with head classes. The results also show improved utility for data augmentation in downstream long-tailed classification. Overall, DiffROP provides a distribution-level contrastive mechanism for improving diffusion models under long-tailed data regimes, and it may apply beyond image synthesis.
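For intuition about the KL term above, here is a minimal sketch. When the two class-conditional reverse steps are Gaussians sharing a fixed isotropic variance (the usual DDPM setting, assumed here), the KL divergence reduces to a scaled squared distance between their means. The function names, the `margin` and `tau` parameters, and the exact hinge form are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gaussian_kl_same_var(mu_i, mu_j, sigma):
    """KL[N(mu_i, sigma^2 I) || N(mu_j, sigma^2 I)] for each row of means.

    With equal isotropic variances the KL is ||mu_i - mu_j||^2 / (2 sigma^2).
    """
    diff = np.asarray(mu_i, dtype=float) - np.asarray(mu_j, dtype=float)
    return np.sum(diff ** 2, axis=-1) / (2.0 * sigma ** 2)

def hinge_overlap_penalty(mu_i, mu_j, sigma, margin=1.0, tau=1.0):
    """Hypothetical hinge-style penalty (an assumption, not the paper's loss).

    Pairs whose divergence is below `margin` (heavily overlapping classes)
    are penalized; pairs already separated by at least `margin` cost nothing.
    `tau` stands in for a (possibly time-dependent) weight.
    """
    kl = gaussian_kl_same_var(mu_i, mu_j, sigma)
    return tau * np.maximum(0.0, margin - kl)

# Two 1-D predicted means for classes i and j at some step t.
separated = hinge_overlap_penalty(np.array([[0.0]]), np.array([[2.0]]), sigma=1.0)
overlapping = hinge_overlap_penalty(np.array([[1.0]]), np.array([[1.0]]), sigma=1.0)
```

With means 0 and 2 and unit variance the divergence is 2, so a margin of 1 yields zero penalty, while identical means incur the full penalty `tau * margin`.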

Abstract

Diffusion models have made significant advances recently in high-quality image synthesis and related tasks. However, diffusion models trained on real-world datasets, which often follow long-tailed distributions, yield inferior fidelity for tail classes. Deep generative models, including diffusion models, are biased towards classes with abundant training images. To address the observed appearance overlap between synthesized images of tail classes and head classes, we propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes. We show that variants of our probabilistic contrastive learning method can be applied to any class-conditional diffusion model. We show significant improvement in image synthesis using our loss on multiple datasets with long-tailed distributions. Extensive experimental results demonstrate that the proposed method can effectively handle imbalanced data for diffusion-based generation and classification models. Our code and datasets will be publicly available at https://github.com/yanliang3612/DiffROP.

Paper Structure (33 sections, 1 theorem, 18 equations, 8 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Class-Imbalanced Representation Learning
Class-Imbalanced Generative Models
Supervised Contrastive Learning for Class-Imbalance Problems
Preliminaries
Unconditional Diffusion Model
Conditional Diffusion Model
Class-Imbalanced Diffusion Model
Method
Probabilistic Contrastive Learning Loss
Overall Loss and Framework
Experiments
Loss visualization on toy data
Experimental Setup
...and 18 more sections

Key Result

Proposition 3.1

The original training objective in eq:kl_loss for DDPM can be rewritten as

Figures (8)

Figure 1: Motivation of our method.
Figure 2: Loss visualization w.r.t. the two estimated means of Gaussians, $m_1$ and $m_2$. With our hinge-based PCL loss, solutions with large distribution overlap (when $m_1$ is close to $m_2$) are penalized, while the global optimum $(0,2)$ is preserved.
Figure 3: Qualitative results for tail classes. Our method produces clearer, more realistic images for tail classes than the baseline methods.
Figure 4: (a) Ablation study of the time-dependent parameter $\tau$ on the CIFAR10LT and CIFAR100LT datasets. 'TS' denotes the DiffROP model with time-dependent $\tau$; 'Vanilla' denotes the standard DDPM. (b) Impact of the classifier-free guidance strength $\omega$ on the DiffROP sampling process.
Figure 5: Qualitative results for tail classes in CIFAR10LT. Our method produces clearer, more realistic images for tail classes than the baseline methods.
...and 3 more figures
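
As a reminder of what the guidance strength $\omega$ in Figure 4(b) controls, the following is a minimal sketch of the classifier-free guidance combination in its common form, where the guided noise estimate extrapolates from the unconditional prediction toward the class-conditional one. The function name and array inputs are hypothetical, and the paper's exact parameterization of $\omega$ is not shown here, so treat the convention below as an assumption.

```python
import numpy as np

def cfg_noise(eps_cond, eps_uncond, omega):
    """Guided noise estimate: (1 + omega) * eps_cond - omega * eps_uncond.

    omega = 0 recovers the purely conditional prediction; larger omega
    pushes samples further toward the class-conditional direction.
    """
    eps_cond = np.asarray(eps_cond, dtype=float)
    eps_uncond = np.asarray(eps_uncond, dtype=float)
    return (1.0 + omega) * eps_cond - omega * eps_uncond

# Stand-in predictions for a single step (not real model outputs).
guided = cfg_noise([1.0, -1.0], [0.0, 0.0], omega=1.0)
```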

Theorems & Definitions (1)

Proposition 3.1: Weight-Biased Decomposition of DDPM Loss Function
