Non-Normal Diffusion Models

Henry Li

TL;DR

The paper generalizes diffusion models by relaxing the conventional Gaussian assumption on the diffusion increments $\Delta\mathbf{x}_k$. It proves an invariance principle that ensures convergence to a diffusion process as the time steps shrink. The result is a flexible framework in which forward and backward updates can use arbitrary increment distributions (e.g., Gaussian, Laplace, Uniform) while training stays tractable through the corresponding KL-based loss terms. The paper derives explicit loss forms for several combinations of increment and model distributions, and reports both competitive likelihoods and qualitative differences in the generated samples. The work broadens the methodological toolkit for score-based diffusion, with possible stylistic control over samples and potential improvements in density estimation and image generation. It also points to future theoretical analysis of score matching under alternative norms and to broader implications for diffusion-model design.
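To make the remark about alternative norms concrete, here is a minimal sketch of a standard fact: the negative log-likelihood of a Gaussian model is, up to an additive constant, a squared-error loss, while that of a Laplace model is an absolute-error loss. This is not the paper's KL-based loss, whose exact form is not given in this summary; the function names `gaussian_nll` and `laplace_nll` and the toy data are assumptions made for illustration.

```python
import numpy as np

LOG_2PI = np.log(2.0 * np.pi)

def gaussian_nll(target, pred, sigma=1.0):
    """Negative log-density of N(pred, sigma^2) at target, summed over elements."""
    r = (target - pred) / sigma
    return np.sum(0.5 * r**2 + np.log(sigma) + 0.5 * LOG_2PI)

def laplace_nll(target, pred, b=1.0):
    """Negative log-density of Laplace(pred, b) at target, summed over elements."""
    return np.sum(np.abs(target - pred) / b + np.log(2.0 * b))

# Toy data (hypothetical): a target increment and a slightly wrong prediction.
rng = np.random.default_rng(0)
target = rng.normal(size=16)
pred = target + 0.1 * rng.normal(size=16)

sq_err = np.sum((target - pred) ** 2)
abs_err = np.sum(np.abs(target - pred))

# Additive constants that do not depend on the prediction (unit scales).
gauss_const = target.size * 0.5 * LOG_2PI
laplace_const = target.size * np.log(2.0)

# Gaussian model -> half the squared L2 error; Laplace model -> L1 error.
print(np.isclose(gaussian_nll(target, pred) - gauss_const, 0.5 * sq_err))
print(np.isclose(laplace_nll(target, pred) - laplace_const, abs_err))
```

With unit scales, minimizing either likelihood in `pred` is therefore equivalent to minimizing the corresponding norm of the residual.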

Abstract

Diffusion models generate samples by incrementally reversing a process that turns data into noise. We show that when the step size goes to zero, the reversed process is invariant to the distribution of these increments. This reveals a previously unconsidered parameter in the design of diffusion models: the distribution of the diffusion step $\Delta x_k := x_{k} - x_{k + 1}$. This parameter is implicitly set by default to be normally distributed in most diffusion models. By lifting this assumption, we generalize the framework for designing diffusion models and establish an expanded class of diffusion processes with greater flexibility in the choice of loss function used during training. We demonstrate the effectiveness of these models on density estimation and generative modeling tasks on standard image datasets, and show that different choices of the distribution of $\Delta x_k$ result in qualitatively different generated samples.
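The intuition behind this invariance can be illustrated numerically with a simpler, classical effect. The sketch below assumes a driftless random walk on $[0, 1]$ whose increments are $\sqrt{\Delta t}\, z$ with unit-variance noise $z$. It is a central-limit-style illustration of the forward walk only, not the paper's theorem about the reversed process. The helper names `unit_variance_noise` and `terminal_kurtosis` are hypothetical.

```python
import numpy as np

def unit_variance_noise(name, rng, size):
    """Draw zero-mean, unit-variance noise from the named distribution."""
    if name == "gaussian":
        return rng.normal(0.0, 1.0, size)
    if name == "laplace":
        # Laplace(0, b) has variance 2 b^2, so b = 1/sqrt(2) gives variance 1.
        return rng.laplace(0.0, 1.0 / np.sqrt(2.0), size)
    if name == "uniform":
        # Uniform(-a, a) has variance a^2 / 3, so a = sqrt(3) gives variance 1.
        return rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size)
    raise ValueError(f"unknown increment distribution: {name}")

def terminal_kurtosis(name, n_steps, n_paths=50_000, seed=0):
    """Kurtosis of the walk at time 1 when [0, 1] is split into n_steps steps."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    z = unit_variance_noise(name, rng, (n_paths, n_steps))
    x_end = np.sqrt(dt) * z.sum(axis=1)  # sum of all increments
    centered = x_end - x_end.mean()
    return np.mean(centered**4) / np.mean(centered**2) ** 2

# A Gaussian has kurtosis 3. With one step the increment law is clearly visible;
# with many small steps every choice should land close to 3.
for name in ("gaussian", "laplace", "uniform"):
    print(name, terminal_kurtosis(name, 1), terminal_kurtosis(name, 64))
```

Matching the variance of the three noise distributions is the design choice that matters here: only the first two moments of the increments survive in the small-step limit, so they are held fixed while the shape varies.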

Paper Structure

This paper contains 26 sections, 16 theorems, 110 equations, 1 figure, 2 tables.

Key Result

Theorem 3.1

Let $\mathbf{x}_k$ be a structured random walk and let $\mathbf{f}(\mathbf{x}, t_k) = \beta(t_k) \mathbf{x}$ be linear. Then the moments of $\mathbf{x}_k$ admit a closed form (the displayed equation is not reproduced here) in terms of $\bar{\alpha}_k = \prod_{i=1}^k \left(1 + \beta_i \right)$ and $\bar{\gamma}_k = \sum_{i=1}^k \left(\frac{\bar{\alpha}_k}{\bar{\alpha}_{i+1}} g_i\right)^2$. For notational convenience, we let $\beta_i := \beta(t_i)\Delta_{t_k}$ and $g_i := g(t_i) \sqrt{\Delta_{t_k}}$.
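A numerical sanity check of closed-form moments of this kind can be sketched as follows. The recursion $x_i = (1 + \beta_i)\, x_{i-1} + g_i z_i$ with zero-mean, unit-variance noise $z_i$ is an assumption, since the definition of a structured random walk is not given in this summary. Under that assumed indexing the variance weights come out as $\bar{\alpha}_k / \bar{\alpha}_i$, which may differ by an index shift from the paper's own convention. The function names `simulate` and `closed_form_moments` are hypothetical.

```python
import numpy as np

def simulate(x0, beta, g, n_paths=200_000, seed=0):
    """Run the assumed recursion x_i = (1 + beta_i) x_{i-1} + g_i z_i."""
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, x0, dtype=float)
    for b_i, g_i in zip(beta, g):
        # Non-Gaussian noise on purpose: Laplace scaled to unit variance.
        z = rng.laplace(0.0, 1.0 / np.sqrt(2.0), n_paths)
        x = (1.0 + b_i) * x + g_i * z
    return x

def closed_form_moments(x0, beta, g):
    """Mean and variance of x_k given x_0 under the assumed recursion."""
    alpha_bar = np.cumprod(1.0 + beta)  # alpha_bar[i-1] = prod_{j<=i} (1 + beta_j)
    mean = alpha_bar[-1] * x0
    # Noise injected at step i is later scaled by prod_{j>i} (1 + beta_j).
    var = np.sum((alpha_bar[-1] / alpha_bar * g) ** 2)
    return mean, var

# Hypothetical schedule: 50 steps on [0, 1], beta(t) = -0.5 and g(t) = 1.
n_steps = 50
dt = 1.0 / n_steps
beta = np.full(n_steps, -0.5 * dt)  # beta_i = beta(t_i) * dt
g = np.full(n_steps, np.sqrt(dt))   # g_i = g(t_i) * sqrt(dt)

x_end = simulate(2.0, beta, g)
mean, var = closed_form_moments(2.0, beta, g)
print("mean: simulated", x_end.mean(), "closed form", mean)
print("var:  simulated", x_end.var(), "closed form", var)
```

Because only the mean and variance of $z_i$ enter the calculation, the same closed form holds whether the noise is Gaussian, Laplace, or Uniform, which is why Laplace noise is used in the simulation.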

Figures (1)

  • Figure 1: Images generated from the same seed via (in order from top to bottom) Gaussian-Gaussian, Laplace-Laplace, Uniform-Gaussian, and Uniform-Laplace diffusion increments. While the qualitative difference is somewhat subtle, Laplace diffusion appears to be biased towards smoother images with more saturated colors.

Theorems & Definitions (28)

  • Definition 1: Structured Random Walks
  • Theorem 3.1: Moments of Structured Random Walks
  • Corollary 3.1
  • Theorem 3.2: Structured Invariance Principle
  • Lemma 1.1
  • proof
  • Lemma 1.2
  • proof
  • Lemma 1.3
  • proof
  • ...and 18 more