Inference-Time Diffusion Model Distillation
Geon Yeong Park, Sang Wan Lee, Jong Chul Ye
TL;DR
Diffusion models sample slowly, and fast distilled student models still trail their high-quality teacher counterparts. The authors propose Distillation++, an inference-time, tuning-free distillation framework that uses a score distillation sampling (SDS) loss and teacher guidance during sampling to steer the student trajectory toward the teacher's clean manifold, without requiring extra data. The method extends to text-conditioned sampling and to multiple solvers through a simple interpolation-based update, and applies guidance in the early sampling steps to improve fidelity and semantic alignment at modest computational cost. Experiments on SDXL-based baselines show consistent improvements over state-of-the-art distillation methods, supporting the practicality of tuning-free, inference-time distillation for diffusion models.
Abstract
Diffusion distillation models effectively accelerate reverse sampling by compressing the process into fewer steps. However, these models still exhibit a performance gap compared to their pre-trained diffusion model counterparts, exacerbated by distribution shifts and accumulated errors during multi-step sampling. To address this, we introduce Distillation++, a novel inference-time distillation framework that reduces this gap by incorporating teacher-guided refinement during sampling. Inspired by recent advances in conditional sampling, our approach recasts student model sampling as a proximal optimization problem with a score distillation sampling (SDS) loss. To this end, we integrate distillation optimization into reverse sampling, which can be viewed as teacher guidance that uses pre-trained diffusion models to drive the student sampling trajectory towards the clean manifold. Thus, Distillation++ improves the denoising process in real time without additional source data or fine-tuning. Distillation++ demonstrates substantial improvements over state-of-the-art distillation baselines, particularly in early sampling stages, positioning itself as a robust guided sampling process crafted for diffusion distillation models. Code: https://github.com/geonyeong-park/inference_distillation.
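
The abstract does not spell out the update rule, so the following is only a minimal sketch of what teacher-guided refinement during sampling might look like, under several assumptions: the student's clean estimate is re-noised, the teacher produces its own clean estimate from that point, and the two are interpolated with a weight `lam` during the first `guided_steps` steps only. The function names, the choice to re-noise at the current noise level, the DDIM-style step, and the toy denoisers are all hypothetical and are not taken from the authors' code.

```python
import numpy as np


def teacher_guided_estimate(x0_student, teacher_denoise, alpha_bar, lam, rng):
    """Blend the student's clean estimate with a teacher re-estimate (hypothetical helper)."""
    # Re-noise the student's clean estimate (assumption: reuse the current noise level).
    eps = rng.standard_normal(x0_student.shape)
    x_noisy = np.sqrt(alpha_bar) * x0_student + np.sqrt(1.0 - alpha_bar) * eps
    # The teacher produces its own clean estimate from the re-noised point.
    x0_teacher = teacher_denoise(x_noisy, alpha_bar)
    # Interpolate: lam = 0 keeps the student, lam = 1 fully trusts the teacher.
    return (1.0 - lam) * x0_student + lam * x0_teacher


def sample(student_denoise, teacher_denoise, x_T, alpha_bars,
           lam=0.5, guided_steps=1, seed=0):
    """Few-step deterministic (DDIM-style) sampling with early teacher guidance."""
    rng = np.random.default_rng(seed)
    x = x_T
    for i, a_t in enumerate(alpha_bars):  # each alpha_bar must be < 1
        x0 = student_denoise(x, a_t)
        if i < guided_steps:  # guidance only in the early steps
            x0 = teacher_guided_estimate(x0, teacher_denoise, a_t, lam, rng)
        # Move to the next noise level; the last step returns the clean estimate.
        a_next = alpha_bars[i + 1] if i + 1 < len(alpha_bars) else 1.0
        eps_hat = (x - np.sqrt(a_t) * x0) / np.sqrt(1.0 - a_t)
        x = np.sqrt(a_next) * x0 + np.sqrt(1.0 - a_next) * eps_hat
    return x


# Toy stand-ins (not real models): the student is biased, the teacher is exact.
def toy_student(x, alpha_bar):
    return np.full_like(x, 0.5)


def toy_teacher(x, alpha_bar):
    return np.ones_like(x)


x_T = np.random.default_rng(1).standard_normal(4)
guided = sample(toy_student, toy_teacher, x_T, alpha_bars=[0.1], guided_steps=1)
unguided = sample(toy_student, toy_teacher, x_T, alpha_bars=[0.1], guided_steps=0)
```

In this toy setting the single guided step moves the output halfway from the student's biased estimate toward the teacher's, while `guided_steps=0` reproduces plain student sampling. Each guided step costs one extra teacher evaluation, which is why restricting guidance to a few early steps keeps the overhead small.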
