Closed-Form Diffusion Models

Christopher Scarvelis, Haitz Sáez de Ocáriz Borde, Justin Solomon

TL;DR

This work tackles the memorization and training-cost issues of score-based diffusion models by introducing smoothed closed-form diffusion models (σ-CFDMs) that generate novel samples without training. By explicitly smoothing the exact closed-form score, the method biases the sampling dynamics toward barycenters of nearby training points, and the authors give a theoretical characterization of the support of the resulting sampling distribution. An efficient nearest-neighbor-based estimator accelerates score evaluation, and the approach scales to high-dimensional data: image generation on CPUs is reported as competitive with GPU-backed neural diffusion baselines in both pixel-space and latent-space experiments. The findings highlight smoothing as a principled inductive bias for generalization and open avenues for conditioning and guidance without neural score learning.

Abstract

Score-based generative models (SGMs) sample from a target distribution by iteratively transforming noise using the score function of the perturbed target. For any finite training set, this score function can be evaluated in closed form, but the resulting SGM memorizes its training data and does not generate novel samples. In practice, one approximates the score by training a neural network via score-matching. The error in this approximation promotes generalization, but neural SGMs are costly to train and sample, and the effective regularization this error provides is not well-understood theoretically. In this work, we instead explicitly smooth the closed-form score to obtain an SGM that generates novel samples without training. We analyze our model and propose an efficient nearest-neighbor-based estimator of its score function. Using this estimator, our method achieves competitive sampling times while running on consumer-grade CPUs.
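
The two ideas in the abstract, a score that is available in closed form for a finite training set and an explicitly smoothed version of it, can be illustrated with a minimal NumPy sketch. It rests on assumptions that are not taken from the paper: the perturbation is plain additive Gaussian noise with standard deviation `noise_std` (the paper's time parameterization may differ), and smoothing is approximated by averaging the closed-form score over `n_noise` Gaussian perturbations of the query point scaled by `sigma`. The function names and parameters are hypothetical.

```python
import numpy as np


def closed_form_score(z, train, noise_std):
    """Score of the empirical distribution of `train` convolved with
    N(0, noise_std^2 I), evaluated at a single point z of shape (D,).

    The perturbed density is a Gaussian mixture centred on the training
    points, so its score is (softmax-weighted mean of the points - z) / noise_std^2.
    """
    sq_dists = np.sum((train - z) ** 2, axis=1)      # (N,)
    logits = -sq_dists / (2.0 * noise_std ** 2)
    logits -= logits.max()                           # numerically stable softmax
    weights = np.exp(logits)
    weights /= weights.sum()
    return (weights @ train - z) / noise_std ** 2


def smoothed_score(z, train, noise_std, sigma, n_noise=64, rng=None):
    """Monte Carlo smoothing (an assumption of this sketch): average the
    closed-form score over Gaussian perturbations of the query point."""
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.standard_normal((n_noise, z.shape[0]))
    scores = [closed_form_score(z + sigma * e, train, noise_std) for e in eps]
    return np.mean(scores, axis=0)


# Usage: two training points in 2D, queried near the right-hand point.
train = np.array([[-1.0, 0.0], [1.0, 0.0]])
z = np.array([0.6, 0.1])
exact = closed_form_score(z, train, noise_std=0.2)
smooth = smoothed_score(z, train, noise_std=0.2, sigma=0.5)
```

With a small `noise_std`, the unsmoothed softmax weights concentrate on the single nearest training point, which is the memorization behavior the abstract describes. Averaging over perturbed query points spreads the weight across several nearby points, so the smoothed score can pull toward a combination of them instead.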

Paper Structure (38 sections, 7 theorems, 51 equations, 19 figures, 4 tables, 1 algorithm)

Key Result

Proposition 4.1

The smoothed score $s_{\sigma,t}(z)$ can be expressed as:

Figures (19)

  • Figure 1: Effect of smoothing on the closed-form score (yellow streamplot). Colors represent distance weights in $k_t(z)$; blue regions of space are drawn to the blue point on the left, and vice-versa.
  • Figure 2: $W_2$ between $\sigma$-CFDM model samples and true samples. We depict model samples for $\sigma\in\{0,0.26,1\}$.
  • Figure 3: Sampling a $\sigma$-CFDM (blue points) yields a dense point cloud given sparse mesh samples (red points). We report the % drop in $W_2$ distance to a dense mesh sampling when using our $\sigma$-CFDM's samples relative to the sparse training samples. We render these point clouds in Polyscope.
  • Figure 4: % change in $W_2$ between $\sigma$-CFDM model samples generated starting at $T=0$ and samples generated starting at $T>0$.
  • Figure 5: $W_2$ between $\sigma$-CFDM model samples generated using the full score and our nearest-neighbor-based estimator for varying number of nearest neighbors $K$ (horizontal axis) and number of random samples $L$ (vertical axis).
  • ...and 14 more figures

Theorems & Definitions (7)

  • Proposition 4.1: $s_{\sigma,t}$ points towards barycenters of training points
  • Theorem 5.1: Support of $\sigma$-CFDM samples
  • Theorem 5.2: Approximation error from starting at $T>0$
  • Proposition 5.3
  • Proposition B.1
  • Proposition B.2
  • Proposition B.3