Score-based generative models are provably robust: an uncertainty quantification perspective

Nikiforos Mimikos-Stamatopoulos; Benjamin J. Zhang; Markos A. Katsoulakis

TL;DR

This work develops an uncertainty-quantification framework for score-based generative models (SGMs) built on PDE regularity theory. Its central result, the Wasserstein Uncertainty Propagation (WUP) theorem, shows how $L^2$ errors in the learned score translate into deviations of the generated distribution under the diffusion flow, measured in integral probability metrics such as Wasserstein-1 ($\mathbf{d}_1$) and total variation (TV). It provides explicit generalization bounds for both explicit score matching (ESM) and denoising score matching (DSM) with minimal assumptions: it is agnostic to the manifold hypothesis and does not require the target distribution to be absolutely continuous. The analysis uses Kolmogorov backward equations and Bernstein-type gradient estimates for Hamilton-Jacobi-Bellman PDEs to obtain computable bounds, and clarifies how early stopping, the choice of reference measure, and the trade-offs among error sources affect robustness. The results pave the way for principled uncertainty quantification, connections to likelihood-free inference, and distributionally robust optimization in SGMs.
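
For orientation, the integral probability metrics mentioned here share one standard textbook form. The display below is not quoted from the paper, and the symbols $\mathcal{F}$, $\mu$, $\nu$ are introduced only for this illustration.

$$
d_{\mathcal{F}}(\mu,\nu) \;=\; \sup_{f \in \mathcal{F}} \left| \int f \, d\mu - \int f \, d\nu \right| .
$$

Taking $\mathcal{F}$ to be the 1-Lipschitz functions gives $\mathbf{d}_1$ (Kantorovich-Rubinstein duality). Taking $\mathcal{F} = \{ f : \|f\|_\infty \le 1 \}$ gives total variation up to a factor of 2 that depends on the normalization convention. Taking $\mathcal{F}$ to be the unit ball of a reproducing kernel Hilbert space gives the maximum mean discrepancy.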

Abstract

Through an uncertainty quantification (UQ) perspective, we show that score-based generative models (SGMs) are provably robust to the multiple sources of error in practical implementation. Our primary tool is the Wasserstein uncertainty propagation (WUP) theorem, a model-form UQ bound that describes how the $L^2$ error from learning the score function propagates to a Wasserstein-1 ($\mathbf{d}_1$) ball around the true data distribution under the evolution of the Fokker-Planck equation. We show how errors due to (a) finite sample approximation, (b) early stopping, (c) score-matching objective choice, (d) score function parametrization expressiveness, and (e) reference distribution choice, impact the quality of the generative model in terms of a $\mathbf{d}_1$ bound of computable quantities. The WUP theorem relies on Bernstein estimates for Hamilton-Jacobi-Bellman partial differential equations (PDE) and the regularizing properties of diffusion processes. Specifically, PDE regularity theory shows that stochasticity is the key mechanism ensuring SGM algorithms are provably robust. The WUP theorem applies to integral probability metrics beyond $\mathbf{d}_1$, such as the total variation distance and the maximum mean discrepancy. Sample complexity and generalization bounds in $\mathbf{d}_1$ follow directly from the WUP theorem. Our approach requires minimal assumptions, is agnostic to the manifold hypothesis and avoids absolute continuity assumptions for the target distribution. Additionally, our results clarify the trade-offs among multiple error sources in SGMs.
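
The robustness claim can be probed numerically on a toy problem. The sketch below is not from the paper and makes several illustrative assumptions: one-dimensional Gaussian data (so the exact score is known in closed form), a forward process $dX = -X\,dt + \sqrt{2}\,dW$, an Euler-Maruyama discretization of the reverse SDE, and a constant perturbation `eps` added to the score so that its $L^2$ error is exactly `|eps|`. All function names and parameter values are hypothetical.

```python
import numpy as np

MU, S = 2.0, 0.5  # assumed toy data distribution N(MU, S^2)

def ou_marginal(t):
    """Mean and variance at time t of dX = -X dt + sqrt(2) dW with X_0 ~ N(MU, S^2)."""
    mean = MU * np.exp(-t)
    var = S**2 * np.exp(-2.0 * t) + 1.0 - np.exp(-2.0 * t)
    return mean, var

def true_score(t, x):
    """Exact score (gradient of the log-density) of the Gaussian marginal at time t."""
    mean, var = ou_marginal(t)
    return -(x - mean) / var

def generate(eps, n=20000, T=3.0, steps=300, seed=0):
    """Euler-Maruyama on the reverse SDE, using the exact score shifted by eps."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    y = rng.standard_normal(n)  # reference distribution N(0, 1)
    for k in range(steps):
        t = T - k * dt  # corresponding forward time
        score = true_score(t, y) + eps  # constant score error of size |eps|
        # Reverse drift is -f + g^2 * score with f(x) = -x and g = sqrt(2).
        y = y + (y + 2.0 * score) * dt + np.sqrt(2.0 * dt) * rng.standard_normal(n)
    return y

def w1_1d(a, b):
    """Empirical Wasserstein-1 distance between equal-size 1D samples."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

def w1_to_data(eps, n=20000, seed=1):
    """W1 distance between generated samples and fresh samples from the data law."""
    rng = np.random.default_rng(seed)
    data = MU + S * rng.standard_normal(n)
    return w1_1d(generate(eps, n=n), data)

if __name__ == "__main__":
    for eps in (0.0, 0.25, 0.5, 1.0):
        print(f"score error {eps:.2f} -> empirical W1 {w1_to_data(eps):.4f}")
```

In this toy setting the empirical $\mathbf{d}_1$ distance should grow in a controlled way with the size of the score error rather than blowing up, which is the qualitative behavior the WUP theorem describes. The sketch does not check the theorem's constants.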

Paper Structure (28 sections, 14 theorems, 171 equations)

Introduction
Background and notation
An uncertainty quantification approach to generalization in SGMs
Source of errors in score-based generative modeling
Model-form uncertainty quantification
Wasserstein uncertainty propagation theorem
Robustness of errors under ESM
Robustness of errors under DSM
Regularity theory of Hamilton-Jacobi-Bellman PDEs enables uncertainty quantification in SGMs
Kolmogorov backward equation determines suitable test functions
IPM bounds depend on choice of terminal function space and gradient estimates
$L^1$ estimates.
Wasserstein-1 ($\mathbf{d}_1$) estimates.
Bernstein estimates from HJB theory provide gradient estimates
Proof sketches: Score-based generative models are robust to errors
...and 13 more sections

Key Result

Theorem 3.1

Let $\Omega= R\mathbb{T}^d$ or $\mathbb{R}^d$. Let $b^1,b^2:[0,T]\times \Omega \rightarrow \mathbb{R}^d$ be given with $\|\nabla b^1\|_{\infty} < \infty$, and $m_1,m_2 \in \mathcal{P}(\Omega)$. If $m^i$ for $i=1,2$ are given by then, up to a universal constant $C>0$, we have the following:

Theorems & Definitions (33)

Remark 2.1: Choice of domain $\Omega$
Theorem 3.1: Wasserstein Uncertainty Propagation
Theorem 3.2: ESM bounds
Theorem 3.3
Remark 3.4: Density lower bound
Remark 3.5: Trade-offs and memorization
Theorem 6.1
Remark 6.2
Lemma 6.3
proof
...and 23 more
