
Score-based Conditional Generation with Fewer Labeled Data by Self-calibrating Classifier Guidance

Paul Kuo-Ming Huang, Si-An Chen, Hsuan-Tien Lin

TL;DR

This paper addresses conditional generation with score-based generative models (SGMs) when labeled data are scarce. It introduces a self-calibration (SC) technique that reinterprets a time-dependent classifier as an energy-based model and regularizes the classifier with a denoising score matching (DSM) loss on its internal unconditional score, so that the guidance it provides stays consistent with the underlying unconditional distribution. Because this loss requires no labels, it can also be applied to unlabeled data while keeping training stable. On CIFAR-10 and CIFAR-100, the paper reports that SC outperforms vanilla classifier-guided SGMs (CGSGMs) and existing regularizers, especially in semi-supervised settings, with gains in both the fidelity and the diversity of generated samples. The work also discusses practical considerations, including scalability to high-resolution datasets, and notes the potential for applying self-calibration to other conditional generative frameworks.

Abstract

Score-based generative models (SGMs) are a popular family of deep generative models that achieve leading image generation quality. Early studies extend SGMs to tackle class-conditional generation by coupling an unconditional SGM with the guidance of a trained classifier. Nevertheless, such classifier-guided SGMs do not always achieve accurate conditional generation, especially when trained with fewer labeled data. We argue that the problem is rooted in the classifier's tendency to overfit without coordinating with the underlying unconditional distribution. To make the classifier respect the unconditional distribution, we propose improving classifier-guided SGMs by letting the classifier regularize itself. The key idea of our proposed method is to use principles from energy-based models to convert the classifier into another view of the unconditional SGM. Existing losses for unconditional SGMs can then be leveraged to achieve regularization by calibrating the classifier's internal unconditional scores. The regularization scheme can be applied to not only the labeled data but also unlabeled ones to further improve the classifier. Across various percentages of fewer labeled data, empirical results show that the proposed approach significantly enhances conditional generation quality. The enhancements confirm the potential of the proposed self-calibration technique for generative modeling with limited labeled data.
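
The "another view of the unconditional SGM" idea can be written out as a short derivation. This is a sketch under assumed notation, not the paper's exact formulation: $f_\theta(\tilde{x}, t)[y]$ denotes the logit of a time-dependent classifier for class $y$, $\sigma_t$ is the noise level of a Gaussian perturbation kernel, and $Z_t(\theta)$ is an intractable normalizing constant. Any time-dependent weighting and the way the term is combined with the cross-entropy loss are left out.

$$
p_\theta(y \mid \tilde{x}, t) = \frac{\exp f_\theta(\tilde{x}, t)[y]}{\sum_{y'} \exp f_\theta(\tilde{x}, t)[y']},
\qquad
p_\theta(\tilde{x}, y \mid t) = \frac{\exp f_\theta(\tilde{x}, t)[y]}{Z_t(\theta)}
$$

$$
\log p_\theta(\tilde{x} \mid t) = \log \sum_{y} \exp f_\theta(\tilde{x}, t)[y] - \log Z_t(\theta)
\;\Longrightarrow\;
\nabla_{\tilde{x}} \log p_\theta(\tilde{x} \mid t) = \nabla_{\tilde{x}} \log \sum_{y} \exp f_\theta(\tilde{x}, t)[y]
$$

$$
\mathcal{L}_{\mathrm{SC}}(\theta) = \mathbb{E}_{t,\, x,\, \tilde{x}} \left[ \tfrac{1}{2} \left\| \nabla_{\tilde{x}} \log \sum_{y} \exp f_\theta(\tilde{x}, t)[y] + \frac{\tilde{x} - x}{\sigma_t^2} \right\|_2^2 \right]
$$

Since $Z_t(\theta)$ does not depend on $\tilde{x}$, it vanishes from the score, which is why the classifier's logits alone are enough to define an unconditional score. The loss involves no label $y$, which is what allows it to be evaluated on unlabeled samples.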

Paper Structure

This paper contains 37 sections, 15 equations, 18 figures, 5 tables.

Figures (18)

  • Figure 1: Illustration of proposed approach. A vanilla CGSGM takes the orange (DSM loss) and green (cross-entropy loss) arrows. The proposed CGSGM-SC additionally considers the two blue arrows representing the proposed self-calibration loss on both labeled and unlabeled data.
  • Figure 2: Calculation of the proposed self-calibration loss. First, the perturbed sample $\tilde{x}$ is fed to the time-dependent classifier to obtain the output logits. Second, the logits are transformed into an unnormalized log-likelihood by applying $\log\sum_y\exp$. Third, the gradient of this quantity w.r.t. the input $\tilde{x}$ is computed to obtain the estimated score. Finally, denoising score matching is applied to the estimated score to obtain the proposed self-calibration loss, which is used as an auxiliary loss when training the classifier.
  • Figure 3: Gradients of classifiers $\nabla_{\bm{x}} \log p(y|{\bm{x}})$ for the toy dataset. The upper row contains the gradients for class 1 (red), and the lower row contains the gradients for class 2 (blue). (a) Real data distribution. (b) Ground truth classifier gradients. Gradients estimated by (c) Vanilla CG, (d) CG-DLSM, (e) CG-JEM, and (f) CG with the proposed self-calibration. The gradients estimated by the vanilla classifier are highly inaccurate and fluctuate greatly, whereas the regularized classifiers produce gradients that are closer to the ground truth and fluctuate far less.
  • Figure 4: Results of class-conditional generation in semi-supervised settings. Although CFSGMs generate images with high intra-Density, which shows that they can generate images of the correct class, those images cover only a small portion of the class-conditional distributions. This makes CGSGMs preferable, since having fewer labeled data does not reduce their generation diversity by much. Compared with other CGSGMs, the proposed self-calibration consistently achieves the best conditional generative performance.
  • Figure 5: Generated images from classifier-only score estimation
  • ...and 13 more figures
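
The four steps in the Figure 2 caption (logits, $\log\sum_y\exp$, gradient w.r.t. $\tilde{x}$, denoising score matching) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_logits` is a hypothetical linear classifier whose only time dependence is a scaling by the noise level `sigma`, the perturbation is Gaussian, and the loss is unweighted. The linear form is chosen so that the input gradient has a closed form; a neural classifier would need automatic differentiation for that step.

```python
import numpy as np


def logsumexp(z):
    """Numerically stable log(sum(exp(z))) over the last axis."""
    m = z.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(z - m).sum(axis=-1, keepdims=True))).squeeze(-1)


def softmax(z):
    """Softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def toy_logits(x_tilde, sigma, W, b):
    """Hypothetical time-dependent classifier: linear logits scaled by noise level.

    x_tilde: (N, D) perturbed samples, W: (C, D), b: (C,). Returns (N, C).
    """
    return (x_tilde @ W.T + b) / (1.0 + sigma**2)


def classifier_score(x_tilde, sigma, W, b):
    """Gradient of logsumexp(logits) w.r.t. x_tilde (closed form for the toy model).

    This plays the role of the classifier's internal unconditional score.
    """
    p = softmax(toy_logits(x_tilde, sigma, W, b))  # (N, C)
    return (p @ W) / (1.0 + sigma**2)              # (N, D)


def self_calibration_loss(x, sigma, W, b, rng):
    """Unweighted denoising score matching loss on the classifier's internal score."""
    noise = rng.standard_normal(x.shape)
    x_tilde = x + sigma * noise                    # Gaussian perturbation of clean x
    # Score of the Gaussian kernel: -(x_tilde - x) / sigma^2 = -noise / sigma
    target = -noise / sigma
    score = classifier_score(x_tilde, sigma, W, b)
    return 0.5 * np.mean(np.sum((score - target) ** 2, axis=-1))
```

No labels appear anywhere in `self_calibration_loss`, so the same function could be called on labeled and unlabeled batches alike, with the result added to the usual cross-entropy term as an auxiliary loss.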