Reflected Diffusion Models

Aaron Lou, Stefano Ermon

TL;DR

Reflected Diffusion Models address a core limitation of standard diffusion models: numerical error can push samples off the data domain, and thresholding, the usual fix, introduces a mismatch between training and sampling. The method instead formulates diffusion on a bounded domain Ω using reflected SDEs. The authors develop constrained denoising score matching (CDSM) to learn the scores of the perturbed densities on Ω, and show that standard diffusion tools such as diffusion guidance, likelihood bounds, and probability-flow ODE sampling carry over to the reflected setting. They report competitive results on image benchmarks (e.g., CIFAR-10) and scalable diffusion on high-dimensional simplices, enforcing data-domain constraints without architectural changes. The work also shows that thresholding schemes are discretizations of reflected SDEs, which enables stable, fast, and faithful sampling under strong guidance and provides a pathway to principled likelihood estimation in constrained generative modeling.

Abstract

Score-based diffusion models learn to reverse a stochastic differential equation that maps data to noise. However, for complex tasks, numerical error can compound and result in highly unnatural samples. Previous work mitigates this drift with thresholding, which projects to the natural data domain (such as pixel space for images) after each diffusion step, but this leads to a mismatch between the training and generative processes. To incorporate data constraints in a principled manner, we present Reflected Diffusion Models, which instead reverse a reflected stochastic differential equation evolving on the support of the data. Our approach learns the perturbed score function through a generalized score matching loss and extends key components of standard diffusion models including diffusion guidance, likelihood-based training, and ODE sampling. We also bridge the theoretical gap with thresholding: such schemes are just discretizations of reflected SDEs. On standard image benchmarks, our method is competitive with or surpasses the state of the art without architectural modifications and, for classifier-free guidance, our approach enables fast exact sampling with ODEs and produces more faithful samples under high guidance weight.

Paper Structure

This paper contains 35 sections, 11 theorems, 70 equations, 19 figures, and 2 tables.

Key Result

Proposition 4.1

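Suppose that we perturb an $\Omega$-supported density $a(\mathbf{x})$ with noise $b(\mathbf{x} | \cdot)$ (also supported on $\Omega$) to get a new density $b(\mathbf{x}) := \int_\Omega a(\mathbf{y}) b(\mathbf{x} | \mathbf{y}) \mathrm{d} \mathbf{y}$. Then, under suitable regularity conditions, the score matching loss for $b$,

$$\frac{1}{2} \mathbb{E}_{\mathbf{x} \sim b} \left\| \mathbf{s}(\mathbf{x}) - \nabla_{\mathbf{x}} \log b(\mathbf{x}) \right\|^2,$$

is equal (up to a constant factor that does not depend on $\mathbf{s}$) to the CDSM loss:

$$\frac{1}{2} \mathbb{E}_{\mathbf{y} \sim a} \mathbb{E}_{\mathbf{x} \sim b(\cdot | \mathbf{y})} \left\| \mathbf{s}(\mathbf{x}) - \nabla_{\mathbf{x}} \log b(\mathbf{x} | \mathbf{y}) \right\|^2.$$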
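As a hedged numerical illustration of the proposition, the sketch below works on $\Omega = [0, 1]$ with reflected Gaussian noise. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two-atom data distribution (standing in for the density $a$), the noise level `sigma = 0.3`, the truncation `n_images`, the candidate functions `s_a` and `s_b`, and all function names. It evaluates the score matching loss for the perturbed density and the conditional (denoising-style) loss by quadrature for two candidate score functions, and checks that the gap between the two losses is the same for both candidates, i.e. that it does not depend on $\mathbf{s}$.

```python
import numpy as np

def image_terms(x, x0, sigma, n_images=5):
    """Unnormalized Gaussian bumps at the mirror images of x0, plus their x-derivatives."""
    k = np.arange(-n_images, n_images + 1)[:, None]
    # Reflecting x0 across 0 and 1 repeatedly gives centres x0 + 2k and -x0 + 2k.
    centres = np.concatenate([x0 + 2 * k, -x0 + 2 * k], axis=0)
    diff = x[None, :] - centres
    bumps = np.exp(-0.5 * (diff / sigma) ** 2)
    return bumps, -diff / sigma**2 * bumps

def transition_density(x, x0, sigma):
    """Reflected Gaussian density b(x | x0) on [0, 1] (truncated image sum)."""
    bumps, _ = image_terms(x, x0, sigma)
    return bumps.sum(axis=0) / (sigma * np.sqrt(2 * np.pi))

def transition_score(x, x0, sigma):
    """d/dx log b(x | x0), differentiating the image sum term by term."""
    bumps, grads = image_terms(x, x0, sigma)
    return grads.sum(axis=0) / bumps.sum(axis=0)

def integrate(values, dx):
    """Trapezoid rule on a uniform grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1])) * dx)

# Illustrative setup (assumed): data mass on two points, one noise level.
atoms, weights, sigma = [0.2, 0.7], [0.5, 0.5], 0.3
x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]

cond = [transition_density(x, y, sigma) for y in atoms]
cond_score = [transition_score(x, y, sigma) for y in atoms]
marg = sum(w * c for w, c in zip(weights, cond))
# grad b = sum_y a(y) * b(x|y) * grad log b(x|y), so the marginal score is:
marg_score = sum(w * c * cs for w, c, cs in zip(weights, cond, cond_score)) / marg

def sm_loss(s):
    """Score matching loss against the (normally intractable) marginal score."""
    return 0.5 * integrate(marg * (s(x) - marg_score) ** 2, dx)

def cdsm_loss(s):
    """Denoising-style loss against the tractable conditional scores."""
    return 0.5 * sum(
        w * integrate(c * (s(x) - cs) ** 2, dx)
        for w, c, cs in zip(weights, cond, cond_score)
    )

s_a = lambda t: np.sin(3.0 * t)   # arbitrary candidate score
s_b = lambda t: 1.0 - 2.0 * t     # another arbitrary candidate score
gap_a = cdsm_loss(s_a) - sm_loss(s_a)
gap_b = cdsm_loss(s_b) - sm_loss(s_b)
print(gap_a, gap_b)
```

In this setup the cross terms of the two losses coincide pointwise, so `gap_a` and `gap_b` should agree up to floating-point rounding. Minimizing the conditional loss over $\mathbf{s}$ is therefore equivalent to minimizing the score matching loss, without ever evaluating the marginal score during training.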

Figures (19)

  • Figure 1: Overview of Reflected Diffusion Models. We map a data distribution $p_0$ supported on $\Omega$ to the prior distribution $p_T$ through a reflected stochastic differential equation. Whenever a Brownian trajectory hits $\partial \Omega$, it is reflected back in instead of escaping (circled in red), so $p_t$ is supported on $\Omega$ for all $t$. We can recover $p_0$ from $p_T$ with a reversed reflected stochastic differential equation by learning the Stein score $\nabla_x \log p_t$. Our generative model is guaranteed to be constrained in $\Omega$.
  • Figure 2: An overview of our computational method for constrained denoising score matching with Brownian transition probabilities. (i) We can draw samples by sampling $\mathcal{N}(\mathbf{x}_0, \sigma_t^2 I)$ and then applying reflections on the boundary. (ii) When $t$ is small, we compute the transition density by summing up a mixture of Gaussians (shown for $\Omega = [0, 1]$). (iii) When $t$ is large, we compute using the frequencies of $\Omega$ (shown for $\Omega = [0, 1]$). (iv) We diffeomorphically transform $\Omega \to [0, 1]^d$, where the transition score is tractable.
  • Figure 3: Without thresholding, standard diffusion models easily diverge. We sample using classifier-free guidance ($w=1$) from a standard diffusion model without using thresholding. Around half of the samples diverge (generating blank images).
  • Figure 4: Non-cherry-picked guided samples from a reflected and a standard diffusion model with high guidance weight. We compare Reflected Diffusion Models with standard diffusion models for generating class-conditioned $64 \times 64$ ImageNet samples with a guidance weight of $w=15$. Our generated images are shown on the left, and the baseline is shown on the right (same positions have same classes). Our method retains fidelity while the baseline suffers from oversaturation.
  • Figure 5: Guided ODE samples. We sample using our ODE with a guidance weight $w=1.5$, retaining image fidelity with fewer forward evaluations (around 100 compared with 1000).
  • ...and 14 more figures
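
The first three steps in the Figure 2 caption can be sketched for $\Omega = [0, 1]$ as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the truncation levels `n_images` and `n_terms`, and the values `x0 = 0.2` and `sigma = 0.3` are all hypothetical. The sketch draws reflected samples by folding Gaussian draws back into the interval, then evaluates the transition density twice, once as a sum of mirrored Gaussians (suited to small noise) and once as a cosine series over the Neumann eigenfunctions of the interval (suited to large noise), and compares the two.

```python
import numpy as np

def reflect_unit(x):
    """Fold real numbers into [0, 1] by repeated reflection at 0 and 1."""
    y = np.mod(x, 2.0)                  # reflections repeat with period 2
    return np.where(y > 1.0, 2.0 - y, y)

def density_images(x, x0, sigma, n_images=5):
    """Transition density as a truncated sum of Gaussians at mirror images of x0."""
    k = np.arange(-n_images, n_images + 1)[:, None]
    direct = np.exp(-0.5 * ((x[None, :] - (x0 + 2 * k)) / sigma) ** 2)
    mirrored = np.exp(-0.5 * ((x[None, :] - (-x0 + 2 * k)) / sigma) ** 2)
    return (direct + mirrored).sum(axis=0) / (sigma * np.sqrt(2 * np.pi))

def density_eigen(x, x0, sigma, n_terms=50):
    """Same density as a truncated cosine series (eigenfunctions cos(n*pi*x))."""
    n = np.arange(1, n_terms + 1)[:, None]
    decay = np.exp(-0.5 * (n * np.pi * sigma) ** 2)  # high frequencies die off fast
    series = decay * np.cos(n * np.pi * x[None, :]) * np.cos(n * np.pi * x0)
    return 1.0 + 2.0 * series.sum(axis=0)

rng = np.random.default_rng(0)
x0, sigma = 0.2, 0.3

# (i) Sample N(x0, sigma^2), then reflect back into [0, 1].
samples = reflect_unit(x0 + sigma * rng.standard_normal(100_000))

# (ii) and (iii) Evaluate the density both ways on a grid and compare.
grid = np.linspace(0.0, 1.0, 1001)
p_images = density_images(grid, x0, sigma)
p_eigen = density_eigen(grid, x0, sigma)
max_gap = float(np.max(np.abs(p_images - p_eigen)))
print(samples.min(), samples.max(), max_gap)
```

At a moderate noise level such as the one assumed here, both truncated sums converge quickly and should agree closely. The image sum needs more terms as `sigma` grows, while the cosine series needs more terms as `sigma` shrinks, which is one way to read the split between small and large $t$ in the caption.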

Theorems & Definitions (22)

  • Remark 3.1
  • Proposition 4.1
  • Proposition 5.1: Thresholding solves a reflected SDE
  • Theorem 7.1: Reflected Girsanov for KL divergence
  • Proposition 1.1
  • proof
  • Proposition 1.2: Probability Flow ODE
  • proof
  • Proposition 1.3: Annealing Noise Level
  • proof
  • ...and 12 more