Global Well-posedness and Convergence Analysis of Score-based Generative Models via Sharp Lipschitz Estimates

Connor Mooney; Zhongjian Wang; Jack Xin; Yifeng Yu

TL;DR

This work addresses global well-posedness and convergence of score-based generative diffusion models under minimal assumptions on the data and score estimation. It develops sharp Hessian and local Lipschitz bounds via a nonlinear Hamilton–Jacobi framework and a heat-equation transform, enabling global-in-time results for smooth data without requiring time-scale separation, and characterizing $O(1/t)$ gradient growth for data supported on compact manifolds in non-smooth settings. The authors establish well-posedness of the backward diffusion with a locally Lipschitz drift and provide KL-divergence convergence guarantees for a uniform exponential-integrator discretization in both near log-concave and general smooth regimes, with explicit dependence on data moments, Lipschitz constants, and discretization parameters. They further analyze manifold-supported data to reveal singular behavior near generation time and discuss practical implications for early stopping and score normalization. Collectively, the results offer rigorous guidance for designing and tuning score-based samplers in challenging data regimes, including non-log-concave and manifold-supported distributions.
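To make the "uniform exponential-integrator discretization" concrete, here is a minimal Python sketch of one. It is not the authors' implementation. It assumes the forward process $dX_t=-\tfrac12 X_t\,dt+dW_t$ (a normalization chosen here for illustration), a uniform step size $h=T/N$, and an exact Gaussian score in place of a learned one. The function names `exact_gaussian_score` and `exponential_integrator_sampler` are hypothetical. Each step freezes the score at the start of the step and integrates the linear part of the reverse SDE exactly.

```python
import numpy as np


def exact_gaussian_score(x, t, var0):
    """Score of p_t when p_0 = N(0, var0), assuming dX = -X/2 dt + dW."""
    var_t = var0 * np.exp(-t) + (1.0 - np.exp(-t))
    return -x / var_t


def exponential_integrator_sampler(score, T, n_steps, n_samples, rng):
    """Simulate the reverse SDE dY = (Y/2 + score) dt + dB with uniform steps.

    The score is frozen at the start of each step; the linear drift Y/2 and
    the Gaussian noise are then integrated exactly over the step.
    """
    h = T / n_steps
    y = rng.standard_normal(n_samples)  # initialize from the N(0, 1) prior
    for k in range(n_steps):
        s = T - k * h  # forward time at which the score is evaluated
        frozen = score(y, s)
        noise = rng.standard_normal(n_samples)
        y = (
            np.exp(h / 2.0) * y
            + 2.0 * (np.exp(h / 2.0) - 1.0) * frozen
            + np.sqrt(np.exp(h) - 1.0) * noise
        )
    return y


rng = np.random.default_rng(0)
var0 = 0.25  # variance of the illustrative 1-D Gaussian data distribution
samples = exponential_integrator_sampler(
    lambda x, t: exact_gaussian_score(x, t, var0),
    T=5.0,
    n_steps=500,
    n_samples=20000,
    rng=rng,
)
print("sample variance:", samples.var(), "target:", var0)
```

With an exact score, the only errors left are the discretization error and the mismatch between $p_T$ and the Gaussian prior, so the sample variance should land close to `var0`. A learned score would add an estimation error on top.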

Abstract

We establish global well-posedness and convergence of score-based generative models (SGM) under minimal general assumptions on the initial data for score estimation. For the smooth case, we start from a Lipschitz bound on the score function that holds over an optimal time interval. The optimality is validated by an example in which the Lipschitz constant of the score is bounded initially but blows up in finite time. This necessitates the separation of time scales in conventional bounds for non-log-concave distributions. In contrast, our follow-up analysis relies only on a local Lipschitz condition and is valid globally in time. This leads to convergence of the numerical scheme without time separation. For the non-smooth case, we show that the optimal Lipschitz bound is $O(1/t)$ in the point-wise sense for distributions supported on a compact, smooth, low-dimensional manifold with boundary.
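As a sanity check on the $O(1/t)$ rate, consider the simplest compactly supported case, a point mass $p_0=\delta_0$. The forward process $dX_t=-\tfrac12 X_t\,dt+dW_t$ is assumed here for illustration only, and this normalization may differ from the paper's. Then

$$
p_t=\mathcal{N}\!\left(0,\,(1-e^{-t})I_d\right),\qquad
\nabla\log p_t(x)=-\frac{x}{1-e^{-t}},\qquad
D^2\log p_t(x)=-\frac{1}{1-e^{-t}}\,I_d ,
$$

so the Lipschitz constant of the score is exactly $\frac{1}{1-e^{-t}}=\frac{1}{t}+\frac12+O(t)$ as $t\to 0^+$. In this example the score cannot stay uniformly Lipschitz down to $t=0$, and the $1/t$ growth is attained.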

Paper Structure (30 sections, 18 theorems, 192 equations, 1 figure)

Introduction
Related work
Preliminaries
Background and Setting the Stage
Foundational Ideas based on Non-linear Hamilton Jacobi Equation
General notations
Sharp Hessian Bound of Score Potential Function
Hessian Estimate of Score Potential Function in Finite Time
Local Estimate
Compactly Supported Data Distributions
Well-posedness and Convergence under Sharp Lipschitz Bound
Case I: $p_0$ is (near) log-concave
Case II: General smooth $p_0$
Conclusion
Limitation
...and 15 more sections

Key Result

Theorem 3.1

Let $M_0$ be a nonnegative number and let $g\in C^2({\mathbb R}^d)$ (this assumption is equivalent to $\log p_0\in C^2({\mathbb R}^d)$).
(1) If $D^2g(x)\preceq M_1I_n$, then [equation omitted in source].
(2) If $D^2g(x)\succeq -M_0I_n$, then for any $T\in \left[0, -\log(1-\frac{1}{M_0})\right)$, we have [equation omitted in source].
Note that if $M_0\leq 1$, then $T\in [0,\infty)$.

Figures (1)

Figure 1: In the above picture, $\Gamma_x=\{sy_1+(1-s)y_2: s\in [0,1]\}$

Theorems & Definitions (29)

Remark 2.2
Theorem 3.1
Corollary 3.2
Remark 3.3
Example 3.4: Loss of Uniform Hessian Bound
Remark 3.5
Theorem 3.6
Corollary 3.7
Remark 3.8
Theorem 3.9
...and 19 more
