Score-based generative models break the curse of dimensionality in learning a family of sub-Gaussian probability distributions

Frank Cole, Yulong Lu

TL;DR

This work provides a theoretical foundation for score-based diffusion models learning sub-Gaussian distributions by linking the log-relative density $f$ to Barron-space function classes. The authors prove that if $f$ can be locally approximated by a neural network with bounded path norm, then the score $\nabla_x \log p_t$ at any fixed time $t$ can be approximated without the curse of dimensionality in $L^2(p_t)$, and that empirical score matching yields TV guarantees for the target distribution. They derive explicit sample-size requirements and show that diffusion-based sampling from Gaussian mixtures can achieve dimension-free performance under these assumptions. The results extend to practical examples, including Barron-function targets and Gaussian mixtures, and they highlight a dimension-free approximation rate for the forward score as a central technical achievement. Collectively, the findings offer a rigorous explanation for the empirical success of SGMs in high dimensions and provide guidance for designing low-complexity target densities in diffusion-based generative modeling.
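The objects named in this summary can be written out explicitly. The following is a sketch under stated assumptions: the normalizing constant $Z$, the symbol $\gamma_d$ for the standard Gaussian density on $\mathbb{R}^d$, and the symbol $s_\theta$ for a candidate score network are notation introduced here for illustration and may differ from the paper's.

$$
p_0(x) = \frac{1}{Z}\, e^{f(x)}\, \gamma_d(x), \qquad \gamma_d(x) = (2\pi)^{-d/2} e^{-\|x\|^2/2}, \qquad\text{so}\qquad \nabla_x \log p_0(x) = \nabla f(x) - x .
$$

Under this reading, the target score is the standard Gaussian score $-x$ plus the gradient of the log-relative density $f$, which is why regularity of $f$ is the natural measure of complexity. The $L^2(p_t)$ error of a candidate $s_\theta$ at a fixed time $t$ is then

$$
\mathbb{E}_{x \sim p_t}\left[\, \| s_\theta(x, t) - \nabla_x \log p_t(x) \|^2 \,\right].
$$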

Abstract

While score-based generative models (SGMs) have achieved remarkable success in enormous image generation tasks, their mathematical foundations are still limited. In this paper, we analyze the approximation and generalization of SGMs in learning a family of sub-Gaussian probability distributions. We introduce a notion of complexity for probability distributions in terms of their relative density with respect to the standard Gaussian measure. We prove that if the log-relative density can be locally approximated by a neural network whose parameters can be suitably bounded, then the distribution generated by empirical score matching approximates the target distribution in total variation with a dimension-independent rate. We illustrate our theory through examples, which include certain mixtures of Gaussians. An essential ingredient of our proof is to derive a dimension-free deep neural network approximation rate for the true score function associated with the forward process, which is interesting in its own right.
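For the Gaussian mixture example, the forward score is available in closed form, which makes it a convenient test case. The sketch below is illustrative only and is not the authors' code. It assumes an Ornstein-Uhlenbeck forward process $X_t = e^{-t} X_0 + \sqrt{1 - e^{-2t}}\, Z$ (a common choice; this summary does not specify the process) and isotropic components that share a variance `sigma2`. All function names are hypothetical.

```python
import numpy as np

def forward_mixture_params(means, sigma2, t):
    """Push mixture parameters through the assumed OU forward process.

    A component N(mu, sigma2 * I) at time 0 becomes
    N(exp(-t) * mu, (exp(-2t) * sigma2 + 1 - exp(-2t)) * I) at time t.
    """
    decay = np.exp(-t)
    means_t = decay * means
    var_t = decay**2 * sigma2 + (1.0 - decay**2)
    return means_t, var_t

def log_density(x, weights, means, var):
    """Log density of an isotropic Gaussian mixture at a single point x of shape (d,)."""
    d = x.shape[-1]
    sq_dist = np.sum((x[None, :] - means) ** 2, axis=1)  # shape (K,)
    log_comp = (np.log(weights) - 0.5 * sq_dist / var
                - 0.5 * d * np.log(2.0 * np.pi * var))
    m = log_comp.max()  # log-sum-exp shift for numerical stability
    return m + np.log(np.sum(np.exp(log_comp - m)))

def score(x, weights, means, var):
    """Exact score (gradient of the log density) of the mixture at x."""
    sq_dist = np.sum((x[None, :] - means) ** 2, axis=1)
    log_comp = np.log(weights) - 0.5 * sq_dist / var
    resp = np.exp(log_comp - log_comp.max())
    resp /= resp.sum()  # posterior component responsibilities
    # Responsibility-weighted average of the per-component scores (mu_k - x) / var.
    return (resp[:, None] * (means - x[None, :])).sum(axis=0) / var

# Usage: a 3-component mixture in dimension 4, evaluated at time t = 0.7.
rng = np.random.default_rng(0)
weights = np.array([0.2, 0.3, 0.5])
means = rng.normal(size=(3, 4))
means_t, var_t = forward_mixture_params(means, sigma2=0.5, t=0.7)
x = rng.normal(size=4)
s = score(x, weights, means_t, var_t)
```

When `sigma2 = 1`, the component variance stays equal to 1 for every `t` under this forward process, so only the means contract toward the origin. A learned score network could be compared against `score` to estimate the $L^2(p_t)$ error empirically.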

Paper Structure (25 sections, 26 theorems, 165 equations)

Key Result

Proposition 1

Suppose Assumptions \ref{datadistr2} and \ref{approxassump} hold. Then there exists a class of neural networks $\mathcal{NN}$ with low complexity such that

Theorems & Definitions (48)

  • Proposition 1: Approximation error for score function
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Proposition 5
  • Lemma 1
  • Proof of Lemma \ref{score}
  • Proposition 6
  • proof
  • Lemma 2
  • ...and 38 more