Generative Diffusion Modeling: A Practical Handbook
Zihan Ding, Chi Jin
TL;DR
This handbook provides a practical, notation-aligned synthesis of diffusion-model families, unifying diffusion probabilistic models, score-based generative models, consistency models, rectified flow, and TrigFlow under a common framework. It emphasizes bridging the paper-to-code gap through standardized formulations, training objectives, and inference procedures, and it outlines post-training techniques such as distillation and reward-based fine-tuning. The work clarifies relationships among methods via unified formulations, velocity mappings, and parameterization dualities, enabling robust implementations and fair comparisons. By focusing on pre-training, distillation, and task-specific fine-tuning, the handbook offers actionable guidance for building scalable, high-quality generative models across images, audio, video, and 3D content.
Abstract
This handbook offers a unified perspective on diffusion models, spanning diffusion probabilistic models, score-based generative models, consistency models, rectified flow, and related methods. By standardizing notation and aligning it with code implementations, it aims to bridge the "paper-to-code" gap and to facilitate robust implementations and fair comparisons. It covers the fundamentals of diffusion models, the pre-training process, and post-training methods, including model distillation and reward-based fine-tuning. Designed as a practical guide, it emphasizes clarity and usability over theoretical depth and focuses on widely adopted approaches to generative modeling with diffusion models.
