Enhancing Diffusion Model Guidance through Calibration and Regularization

Seyed Alireza Javid, Amirhossein Bagheri, Nuria González-Prelcic

TL;DR

Classifier-guided diffusion commonly suffers from overconfident predictions during early denoising, which cause the guidance gradient to vanish. The paper couples a differentiable Smooth ECE calibration loss with divergence-aware sampling strategies that operate on off-the-shelf classifiers, avoiding diffusion-model retraining. It analyzes reverse KL, forward KL, and Jensen-Shannon divergences theoretically and evaluates them on ImageNet 128x128, where Jensen-Shannon guidance with a ResNet-101 classifier reaches an FID of 2.13, improving on existing classifier-guided methods while maintaining a strong precision-recall balance. The results indicate that principled calibration and divergence-aware sampling provide practical, plug-and-play improvements for conditional image generation.

Abstract

Classifier-guided diffusion models have emerged as a powerful approach for conditional image generation, but they suffer from overconfident predictions during early denoising steps, causing the guidance gradient to vanish. This paper introduces two complementary contributions to address this issue. First, we propose a differentiable calibration objective based on the Smooth Expected Calibration Error (Smooth ECE), which improves classifier calibration with minimal fine-tuning and yields measurable improvements in Fréchet Inception Distance (FID). Second, we develop enhanced sampling guidance methods that operate on off-the-shelf classifiers without requiring retraining. These include tilted sampling with batch-level reweighting, adaptive entropy-regularized sampling to preserve diversity, and a novel f-divergence-based sampling strategy that strengthens class-consistent guidance while maintaining mode coverage. Experiments on ImageNet 128x128 demonstrate that our divergence-regularized guidance achieves an FID of 2.13 using a ResNet-101 classifier, improving upon existing classifier-guided diffusion methods while requiring no diffusion model retraining. The results show that principled calibration and divergence-aware sampling provide practical and effective improvements for classifier-guided diffusion.
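The vanishing-gradient effect named in the abstract can be illustrated with a small numerical sketch. It is not the paper's code: the three-class logits and the scale factors below are hypothetical values chosen for illustration. It relies only on the standard identity that the gradient of log softmax(z)[y] with respect to the logits z equals onehot(y) - p, so the signal that would be backpropagated to the image shrinks as the classifier's confidence in class y approaches one.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def grad_log_prob_wrt_logits(logits, y):
    """Gradient of log softmax(logits)[y] w.r.t. the logits: onehot(y) - p."""
    p = softmax(logits)
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    return onehot - p

# Hypothetical logits; larger scales mimic a more confident classifier.
base_logits = np.array([2.0, 1.0, 0.0])
grad_norms = {}
for scale in (1.0, 5.0, 20.0):
    logits = scale * base_logits
    g = grad_log_prob_wrt_logits(logits, y=0)
    grad_norms[scale] = float(np.linalg.norm(g))
    print(f"scale={scale:5.1f}  p(y)={softmax(logits)[0]:.6f}  "
          f"grad norm={grad_norms[scale]:.3e}")
```

The printed gradient norm decreases monotonically as the scale grows. In a real guided sampler this logit-space gradient is multiplied by the classifier's Jacobian with respect to the noisy image, so a near-zero factor here suppresses the guidance signal regardless of that Jacobian.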

Paper Structure

This paper contains 42 sections, 8 theorems, 73 equations, 3 figures, 8 tables, 2 algorithms.

Key Result

Proposition 1

The gradient of the entropy-regularized score can be written as

Figures (3)

  • Figure 1: Visualization of the denoising sampling process. Overconfidence in classifier-guided diffusion sampling without regularization is mostly effective in the last steps, where the probability is close to one.
  • Figure 2: Visualization of intermediate samples and the corresponding classifier gradients.
  • Figure 3: Visual comparison of intermediate samples and the corresponding classifier gradients. The seed is fixed for direct comparison.

Theorems & Definitions (18)

  • Remark 1
  • Proposition 1
  • Proposition 2
  • Corollary 1: Reverse KL gradient
  • Lemma 1: Reverse KL decomposition
  • Proposition 3: Gaussian mixture analysis
  • Remark 2
  • Corollary 2: Forward KL gradient
  • Corollary 3: Jensen-Shannon gradient
  • Corollary 4: Squared Hellinger gradient
  • ...and 8 more
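
For orientation, the divergences named in the corollaries above have the following standard textbook definitions for two discrete distributions $p$ and $q$ over class labels $y$. This is a reference sketch, not taken from the paper: which argument plays the role of the classifier posterior, which direction is called "forward" or "reverse", and the normalization of the Hellinger term are conventions that the paper may fix differently.

```latex
\begin{align}
  D_f(p \,\|\, q) &= \sum_y q(y)\, f\!\left(\frac{p(y)}{q(y)}\right),
    \qquad f \text{ convex},\ f(1) = 0
    && \text{(general $f$-divergence)} \\
  D_{\mathrm{KL}}(p \,\|\, q) &= \sum_y p(y) \log \frac{p(y)}{q(y)}
    && \text{(KL divergence)} \\
  D_{\mathrm{JS}}(p, q) &= \tfrac{1}{2} D_{\mathrm{KL}}(p \,\|\, m)
    + \tfrac{1}{2} D_{\mathrm{KL}}(q \,\|\, m),
    \qquad m = \tfrac{1}{2}(p + q)
    && \text{(Jensen-Shannon)} \\
  H^2(p, q) &= \tfrac{1}{2} \sum_y \left(\sqrt{p(y)} - \sqrt{q(y)}\right)^2
    && \text{(squared Hellinger)}
\end{align}
```

Swapping the arguments of $D_{\mathrm{KL}}$ gives the other KL direction. The Jensen-Shannon and squared Hellinger divergences are symmetric in $p$ and $q$ and bounded, unlike either KL direction.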