A Comprehensive Survey on Knowledge Distillation of Diffusion Models

Weijian Luo

TL;DR

Diffusion models deliver strong, flexible generative modeling but suffer from slow sampling. This survey organizes diffusion distillation into diffusion-to-field, diffusion-to-generator, and training-free acceleration, detailing how knowledge from large diffusion models can be transferred to faster, smaller surrogates and to efficient implicit generators. It covers both output- and path-based distillation, as well as deterministic and stochastic generator outcomes, and contrasts training-based and training-free acceleration strategies. The result is a comprehensive map of methods, trade-offs, and open questions for making diffusion models practical at scale.

Abstract

Diffusion Models (DMs), also referred to as score-based diffusion models, use neural networks to specify score functions. Unlike most other probabilistic models, DMs model the score functions directly, which makes them more flexible to parametrize and potentially highly expressive for probabilistic modeling. DMs can learn fine-grained knowledge of the underlying distribution, namely its marginal score functions. A crucial research direction is therefore to explore how to distill the knowledge of DMs and fully utilize their potential. Our objective is to provide a comprehensible overview of modern approaches for distilling DMs, starting with an introduction to DMs and a discussion of the challenges involved in distilling them into neural vector fields. We also provide an overview of existing work on distilling DMs into both stochastic and deterministic implicit generators. Finally, we review accelerated diffusion sampling algorithms as a training-free method for distillation. Our tutorial is intended for individuals with a basic understanding of generative models who wish to apply DM distillation or embark on a research project in this field.

Paper Structure

This paper contains 21 sections, 48 equations, 6 figures.

Figures (6)

  • Figure 1: Forward and Reversed SDE of Diffusion Models. The figure is taken from Song2020ScoreBasedGM.
  • Figure 2: Knowledge Distillation Strategy proposed in Luhman2021KnowledgeDI.
  • Figure 3: Progressive Distillation Strategy proposed in Salimans2022ProgressiveDF.
  • Figure 4: Reflow Strategy for path distillation. The figure is taken from Liu2022FlowSA.
  • Figure 5: Reflow Strategy for path distillation. The figure is taken from Liu2022FlowSA.
  • ...and 1 more figure