Training Class-Imbalanced Diffusion Model Via Overlap Optimization
Divin Yan, Lu Qi, Vincent Tao Hu, Ming-Hsuan Yang, Meng Tang
TL;DR
This work tackles the fidelity bias toward head classes in class-conditional diffusion models trained on long-tailed data. It introduces DiffROP, a probabilistic contrastive learning framework that penalizes distribution overlap between classes by increasing the divergence $D_{KL}\left[p_{\theta}(x_{t-1}|x_t,\mathbf{c}^i)\|p_{\theta}(x_{t-1}|x_t,\mathbf{c}^j)\right]$ between the reverse-step distributions conditioned on different classes $\mathbf{c}^i$ and $\mathbf{c}^j$. This term is combined with the standard diffusion objective, time-dependent weighting, and hinge-like margins. The approach is modular and compatible with classifier-free guidance, and it reports gains in FID, Recall, and Inception Score on CIFAR10LT and CIFAR100LT, in particular by reducing the overlap of tail classes with head classes. The results also show improved utility for data augmentation in downstream long-tailed classification. Overall, DiffROP provides a distribution-level contrastive mechanism for improving diffusion models trained on long-tailed data.
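To make the overlap term concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common DDPM parameterization in which both reverse-step distributions are Gaussians with the same fixed isotropic variance $\sigma_t^2$, so that the KL divergence reduces to $\|\mu_i-\mu_j\|^2/(2\sigma_t^2)$. Reducing overlap corresponds to a larger divergence, so the penalty shrinks as the KL grows. The function names, the specific hinge form `max(0, margin - KL)`, and the `margin` and `weight` parameters are illustrative assumptions; the source only states that hinge-like margins and time-dependent weighting are used.

```python
import numpy as np


def gaussian_kl_same_var(mu_i, mu_j, sigma2):
    """Per-sample KL[N(mu_i, sigma2*I) || N(mu_j, sigma2*I)].

    With equal isotropic variances the KL reduces to
    ||mu_i - mu_j||^2 / (2 * sigma2).
    """
    # Flatten everything except the batch axis, e.g. (B, C, H, W) -> (B, C*H*W).
    diff = (mu_i - mu_j).reshape(mu_i.shape[0], -1)
    return (diff ** 2).sum(axis=1) / (2.0 * sigma2)


def overlap_penalty(mu_i, mu_j, sigma2, margin=1.0, weight=1.0):
    """Hypothetical hinge-style penalty on class overlap.

    The penalty is positive only while the KL between the two class-conditional
    reverse steps is below `margin`; once the classes are separated by at least
    `margin`, it is zero. `weight` stands in for a time-dependent weight.
    """
    kl = gaussian_kl_same_var(mu_i, mu_j, sigma2)
    return float(weight * np.maximum(0.0, margin - kl).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for the predicted reverse-step means under two class labels.
    mu_head = rng.normal(size=(8, 3, 4, 4))
    mu_tail = mu_head + 0.01 * rng.normal(size=(8, 3, 4, 4))
    # Nearly identical means give a KL close to zero, so the penalty is close to the margin.
    print(overlap_penalty(mu_head, mu_tail, sigma2=0.5, margin=1.0))
```

In a training loop, such a penalty would be added to the standard denoising loss, with the two means obtained by running the same noisy input $x_t$ through the model under two different class labels.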
Abstract
Diffusion models have made significant advances recently in high-quality image synthesis and related tasks. However, diffusion models trained on real-world datasets, which often follow long-tailed distributions, yield inferior fidelity for tail classes. Deep generative models, including diffusion models, are biased towards classes with abundant training images. To address the observed appearance overlap between synthesized images of tail classes and head classes, we propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes. We show that variants of our probabilistic contrastive learning method can be applied to any class-conditional diffusion model. Our loss yields significant improvements in image synthesis on multiple datasets with long-tailed distributions. Extensive experimental results demonstrate that the proposed method can effectively handle imbalanced data for diffusion-based generation and classification models. Our code and datasets will be publicly available at https://github.com/yanliang3612/DiffROP.
