A Comprehensive Survey on Knowledge Distillation of Diffusion Models
Weijian Luo
TL;DR
Diffusion models deliver strong, flexible generative modeling but suffer from slow sampling. This survey organizes diffusion distillation into three categories: diffusion-to-field distillation, diffusion-to-generator distillation, and training-free acceleration. It details how knowledge from large diffusion models can be transferred to faster, smaller surrogates and to efficient implicit generators. It covers both output- and path-based distillation, both deterministic and stochastic generators, and contrasts training-based with training-free acceleration strategies. The result is a comprehensive map of methods, trade-offs, and open questions for making diffusion models practical at scale.
Abstract
Diffusion Models (DMs), also referred to as score-based diffusion models, utilize neural networks to specify score functions. Unlike most other probabilistic models, DMs directly model the score functions, which makes them more flexible to parametrize and potentially highly expressive for probabilistic modeling. DMs can learn fine-grained knowledge, i.e., marginal score functions, of the underlying distribution. Therefore, a crucial research direction is to explore how to distill the knowledge of DMs and fully utilize their potential. Our objective is to provide a comprehensible overview of modern approaches to distilling DMs, starting with an introduction to DMs and a discussion of the challenges involved in distilling them into neural vector fields. We also provide an overview of existing work on distilling DMs into both stochastic and deterministic implicit generators. Finally, we review accelerated diffusion sampling algorithms as a training-free method for distillation. Our tutorial is intended for individuals with a basic understanding of generative models who wish to apply DM distillation or embark on a research project in this field.
