Diffusion Models: A Mathematical Introduction

Sepehr Maleki, Negar Pourmoazemi

TL;DR

A concise, self-contained derivation of diffusion-based generative models, focusing on transparent algebra, explicit intermediate steps, and consistent notation, so that readers can both follow the theory and implement the corresponding algorithms in practice.

Abstract

We present a concise, self-contained derivation of diffusion-based generative models. Starting from basic properties of Gaussian distributions (densities, quadratic expectations, re-parameterisation, products, and KL divergences), we construct denoising diffusion probabilistic models from first principles. This includes the forward noising process, its closed-form marginals, the exact discrete reverse posterior, and the associated variational bound, which simplifies to the standard noise-prediction objective used in practice. We then discuss likelihood estimation and accelerated sampling, covering DDIM, adversarially learned reverse dynamics (DDGAN), and multi-scale variants such as nested and latent diffusion, with Stable Diffusion as a canonical example. A continuous-time formulation follows, in which we derive the probability-flow ODE from the diffusion SDE via the continuity and Fokker-Planck equations, introduce flow matching, and show how rectified flows recover DDIM up to a time re-parameterisation. Finally, we treat guided diffusion, interpreting classifier guidance as a posterior score correction and classifier-free guidance as a principled interpolation between conditional and unconditional scores. Throughout, the focus is on transparent algebra, explicit intermediate steps, and consistent notation, so that readers can both follow the theory and implement the corresponding algorithms in practice.

Paper Structure

This paper contains 31 sections, 32 theorems, and 205 equations.

Key Result

Lemma 2.2

Let $\mathbf{A},\mathbf{B}\in\mathbb{R}^{d\times d}$ and $\mathbf{u},\mathbf{v}\in\mathbb{R}^d$. Then:

Theorems & Definitions (67)

  • Definition 2.1: Trace
  • Lemma 2.2: Trace identities
  • Lemma 2.3: Trace trick
  • proof
  • Proposition 2.4: Expectation of a centered quadratic form
  • proof
  • Definition 2.5: Affine re-parameterisation
  • Proposition 2.6: Distribution of an affine transform of a standard Gaussian
  • proof
  • Corollary 2.7: Sampling $\mathcal{N}(\boldsymbol{\mu}, \mathbf{\Sigma})$
  • ...and 57 more
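
The re-parameterisation entries in the list above (Definition 2.5, Proposition 2.6, Corollary 2.7) can be illustrated with a minimal sketch. The statements themselves are not reproduced on this page, so the code assumes the standard construction: if $\boldsymbol{\epsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I})$ and $\mathbf{L}\mathbf{L}^\top=\mathbf{\Sigma}$, then $\boldsymbol{\mu}+\mathbf{L}\boldsymbol{\epsilon}\sim\mathcal{N}(\boldsymbol{\mu},\mathbf{\Sigma})$. The choice of a Cholesky factor for $\mathbf{L}$, the helper names (`cholesky`, `sample_gaussian`, `empirical_moments`), and the example values of `mu` and `sigma` are all illustrative assumptions, not taken from the paper.

```python
import math
import random


def cholesky(sigma):
    """Return lower-triangular L with L L^T = sigma.

    Assumes sigma is a symmetric positive-definite matrix (list of lists).
    """
    d = len(sigma)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def sample_gaussian(mu, L, rng):
    """Draw x = mu + L eps with eps ~ N(0, I) (affine re-parameterisation)."""
    d = len(mu)
    eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
    # L is lower triangular, so row i only uses eps[0..i].
    return [mu[i] + sum(L[i][k] * eps[k] for k in range(i + 1)) for i in range(d)]


def empirical_moments(samples):
    """Return the sample mean and the unbiased sample covariance."""
    n, d = len(samples), len(samples[0])
    mean = [sum(x[i] for x in samples) / n for i in range(d)]
    cov = [
        [
            sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in samples) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov


# Illustrative target distribution (hypothetical values).
mu = [1.0, -2.0]
sigma = [[2.0, 0.6], [0.6, 1.0]]

rng = random.Random(0)  # fixed seed for reproducibility
L = cholesky(sigma)
samples = [sample_gaussian(mu, L, rng) for _ in range(20000)]
mean, cov = empirical_moments(samples)
```

With enough samples, `mean` and `cov` should approach `mu` and `sigma`. Any other factor satisfying $\mathbf{L}\mathbf{L}^\top=\mathbf{\Sigma}$ (for example a symmetric square root) would serve equally well; Cholesky is used here only because it is short to write without external libraries.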