Convergence for score-based generative modeling with polynomial complexity

Holden Lee, Jianfeng Lu, Yixin Tan

TL;DR

This work provides the first polynomial-time convergence guarantees for score-based generative modeling when the score estimator is accurate in $L^2(p)$, avoiding both error that grows exponentially in time and a curse of dimensionality. It develops a general framework that converts $L^2$-level score accuracy into high-probability TV guarantees via a bad-set analysis, and applies the framework to Langevin dynamics and reverse SDEs, including annealed and predictor-corrector variants. A key contribution is showing that annealing and predictor-corrector strategies yield favorable convergence rates under smoothness and log-Sobolev assumptions, with bounds that scale polynomially in the dimension and in the smoothness and log-Sobolev constants. The results provide theoretical grounding for practical SGM procedures and motivate future work on multimodal distributions, weaker score-error regimes, and learning guarantees for the score function.
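
As a rough illustration of why an $L^2(p)$ error bound is compatible with a bad-set argument, Markov's inequality already shows that the set where the score estimate is far from the true score has small probability under $p$. The symbols $\varepsilon$ (the $L^2(p)$ error), $\varepsilon_1$ (a pointwise threshold), and $B$ (the bad set) are notation assumed for this sketch and are not necessarily the paper's:

```latex
\[
  \mathbb{E}_{p}\bigl[\|s - \nabla \ln p\|^2\bigr] \le \varepsilon^2
  \quad\Longrightarrow\quad
  p(B) \le \frac{\varepsilon^2}{\varepsilon_1^2},
  \qquad
  B := \bigl\{x \in \mathbb{R}^d : \|s(x) - \nabla \ln p(x)\| \ge \varepsilon_1 \bigr\}.
\]
```

Outside $B$ the estimate is accurate to within $\varepsilon_1$ pointwise, so an analysis that needs a uniform error bound can plausibly be applied there, with the probability of ever entering $B$ accounted for separately.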

Abstract

Score-based generative modeling (SGM) is a highly successful approach for learning a probability distribution from data and generating further samples. We prove the first polynomial convergence guarantees for the core mechanic behind SGM: drawing samples from a probability density $p$ given a score estimate (an estimate of $\nabla \ln p$) that is accurate in $L^2(p)$. Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality. Our guarantee works for any smooth distribution and depends polynomially on its log-Sobolev constant. Using our guarantee, we give a theoretical analysis of score-based generative modeling, which transforms white-noise input into samples from a learned data distribution given score estimates at different noise scales. Our analysis gives theoretical grounding to the observation that an annealed procedure is required in practice to generate good samples, as our proof depends essentially on using annealing to obtain a warm start at each step. Moreover, we show that a predictor-corrector algorithm gives better convergence than using either portion alone.
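
The sampling mechanic described above can be sketched in a few lines. The following is a toy, one-dimensional illustration and not the paper's algorithm or constants: the Gaussian target, the noise schedule `noise_levels`, the step size, and the `err * sin(x)` perturbation (a stand-in for a learned score's estimation error) are all assumptions made for this example. It runs Langevin Monte Carlo with an approximate score at decreasing noise scales, warm-starting each level from the previous one.

```python
import math
import random


def lmc(score, x, step_size, n_steps, rng):
    """Langevin Monte Carlo: x <- x + h * score(x) + sqrt(2h) * N(0, 1)."""
    noise_scale = math.sqrt(2.0 * step_size)
    for _ in range(n_steps):
        x = x + step_size * score(x) + noise_scale * rng.gauss(0.0, 1.0)
    return x


def noisy_score_estimate(mu, sigma, noise_sigma, err=0.05):
    """Score of N(mu, sigma^2) convolved with N(0, noise_sigma^2), plus a small
    bounded perturbation standing in for estimation error (an assumption)."""
    var = sigma**2 + noise_sigma**2
    return lambda x: -(x - mu) / var + err * math.sin(x)


def annealed_lmc(mu, sigma, noise_levels, step_size, n_steps, rng):
    """Run LMC at decreasing noise levels, warm-starting each from the last."""
    # Initialize from pure noise at the largest scale, a rough proxy for
    # the most heavily smoothed distribution.
    x = rng.gauss(0.0, noise_levels[0])
    for noise_sigma in noise_levels:
        score = noisy_score_estimate(mu, sigma, noise_sigma)
        x = lmc(score, x, step_size, n_steps, rng)
    return x


# Hypothetical toy setup: target N(2, 1), five noise scales, 1000 chains.
rng = random.Random(0)
target_mu, target_sigma = 2.0, 1.0
noise_levels = [4.0, 2.0, 1.0, 0.5, 0.1]
samples = [
    annealed_lmc(target_mu, target_sigma, noise_levels,
                 step_size=0.02, n_steps=150, rng=rng)
    for _ in range(1000)
]
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)
print(f"mean={sample_mean:.3f}, var={sample_var:.3f}")
```

In this toy setting the empirical mean and variance should land near those of the target, up to discretization error, the small residual noise level, and the injected score error. A unimodal Gaussian does not actually need annealing; the schedule is included only to show the warm-start structure.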

Paper Structure

This paper contains 35 sections, 41 theorems, 197 equations, 2 algorithms.

Key Result

Theorem 2.1

Let $p:\mathbb{R}^d\to \mathbb{R}$ be a probability density satisfying Assumption a:p (a:p-smooth, a:p-lsi) with $L \ge 1$, and let $s:\mathbb{R}^d\to \mathbb{R}^d$ be a score estimate satisfying Assumption a:score (a:score-error). Consider the accuracy requirement in $\operatorname{TV}$ and $\chi^2$: $0<\ldots$ then running (e:lmc-se) with score estimate $s$, step size $h=\Theta\Bigl(\frac{\varepsilon_\chi^2}{d\,\ldots}\Bigr)$ …

Theorems & Definitions (72)

  • Theorem 2.1: LMC with $L^2$-accurate score estimate
  • Theorem 2.2: Annealed LMC with $L^2$-accurate score estimate
  • Theorem 3.1: Predictor with $L^2$-accurate score estimate, DDPM
  • Theorem 3.2: Predictor-corrector with $L^2$-accurate score estimate
  • Theorem 4.1
  • proof
  • Theorem 4.2: LMC under $L^\infty$ bound on gradient error
  • proof: Proof sketch of Theorem \ref{t:corrector-tv-chi2}
  • proof: Proof sketch of Theorem \ref{t:ald}
  • Theorem 4.3: Predictor steps under $L^\infty$ bound on score estimate, DDPM
  • ...and 62 more