Advancing Wasserstein Convergence Analysis of Score-Based Models: Insights from Discretization and Second-Order Acceleration

Yifeng Yu, Lu Yu

TL;DR

This work analyzes Wasserstein-2 ($W_2$) convergence of score-based diffusion models under several discretization schemes, showing how Euler-Maruyama (EM), the exponential integrator (EI), and their midpoint-randomized variants (REM and REI) affect convergence and where the discretization and score-estimation errors enter the bounds. It then introduces a Hessian-based second-order sampler built on local linearization, which reaches $W_2$ accuracy $\varepsilon$ in $\widetilde{\mathcal{O}}(1/\varepsilon)$ iterations, compared with $\widetilde{\mathcal{O}}(1/\varepsilon^2)$ for vanilla diffusion models. The theoretical results are complemented by numerical studies on penalized logistic regression posteriors, in which the Hessian-informed method consistently outperforms the first-order schemes. The work broadens the understanding of Wasserstein convergence in score-based generative models (SGMs) and offers practical guidance on choosing a discretization strategy and on using second-order information to accelerate diffusion-based samplers. It also sets the stage for future work on relaxing strong log-concavity and extending the analysis to more general forward processes and deterministic ODE-based samplers.

Abstract

Score-based diffusion models have emerged as powerful tools in generative modeling, yet their theoretical foundations remain underexplored. In this work, we focus on the Wasserstein convergence analysis of score-based diffusion models. Specifically, we investigate the impact of various discretization schemes, including Euler discretization, exponential integrators, and midpoint randomization methods. Our analysis provides a quantitative comparison of these discrete approximations, emphasizing their influence on convergence behavior. Furthermore, we explore scenarios where Hessian information is available and propose an accelerated sampler based on the local linearization method. We demonstrate that this Hessian-based approach achieves faster convergence rates of order $\widetilde{\mathcal{O}}\left(\frac{1}{\varepsilon}\right)$, significantly improving upon the standard rate $\widetilde{\mathcal{O}}\left(\frac{1}{\varepsilon^2}\right)$ of vanilla diffusion models, where $\varepsilon$ denotes the target accuracy.
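
To make the comparison of discretization schemes concrete, the following is a minimal sketch of one Euler step and one exponential-integrator step on a toy problem. It is not the paper's setup: it assumes a one-dimensional Gaussian target $N(\mu,\sigma^2)$, the forward Ornstein-Uhlenbeck process $dX_t=-X_t\,dt+\sqrt{2}\,dB_t$, and an exact (error-free) score, all of which are illustrative choices. Under these assumptions every iterate is Gaussian, so its mean and variance can be propagated in closed form and the $W_2$ distance to the target computed without sampling. The function names `gaussian_score_coeffs` and `w2_error` are hypothetical.

```python
import math


def gaussian_score_coeffs(t, mu, sigma2):
    """Return (a, b) with score(t, y) = a * y + b for p_t = N(mu e^{-t}, v_t).

    Assumes the forward process dX = -X dt + sqrt(2) dB started at N(mu, sigma2),
    so v_t = 1 + (sigma2 - 1) e^{-2t}.
    """
    v = 1.0 + (sigma2 - 1.0) * math.exp(-2.0 * t)
    return -1.0 / v, mu * math.exp(-t) / v


def w2_error(scheme, h, T=5.0, mu=2.0, sigma2=0.25):
    """W2 distance between the sampler output and the target N(mu, sigma2).

    Discretizes the reverse SDE dY = (Y + 2 * score(T - t, Y)) dt + sqrt(2) dB
    with step size h, starting from the prior N(0, 1).
    """
    n_steps = round(T / h)
    mean, var = 0.0, 1.0  # law of Y_0: the prior N(0, 1)
    for k in range(n_steps):
        t = T - k * h  # forward time at which the score is evaluated
        a, b = gaussian_score_coeffs(t, mu, sigma2)
        if scheme == "EM":
            # Euler step: freeze the whole drift over [t_k, t_k + h].
            gain = 1.0 + h * (1.0 + 2.0 * a)
            shift = 2.0 * h * b
            noise_var = 2.0 * h
        elif scheme == "EI":
            # Exponential integrator: freeze only the score and
            # integrate the linear part of the drift exactly.
            eh = math.exp(h)
            gain = eh + 2.0 * (eh - 1.0) * a
            shift = 2.0 * (eh - 1.0) * b
            noise_var = math.exp(2.0 * h) - 1.0
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        # Each step is affine in Y plus independent Gaussian noise.
        mean = gain * mean + shift
        var = gain * gain * var + noise_var
    # Closed-form W2 distance between two one-dimensional Gaussians.
    return math.hypot(mean - mu, math.sqrt(var) - math.sqrt(sigma2))


if __name__ == "__main__":
    for scheme in ("EM", "EI"):
        for h in (0.1, 0.05, 0.01):
            print(scheme, h, w2_error(scheme, h))
```

Running the script prints the $W_2$ error of each scheme for a few step sizes. The toy problem only illustrates how the two update rules differ; it does not reproduce the paper's bounds or its ranking of the schemes.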

Paper Structure

This paper contains 27 sections, 21 theorems, 227 equations, 1 figure.

Key Result

Theorem 1

Suppose that Assumptions asm:p0scLipx, asm:scoreerr, and asm:scLipt hold. Then it holds that … where … with $m_{\min}=\min(1,m_0)$ and $L_{\max}=1+L_0$.

Figures (1)

  • Figure 1: Error of the various discretization schemes and the second-order sampler for different choices of step size.

Theorems & Definitions (21)

  • Theorem 1
  • Corollary 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Corollary 7
  • Lemma 8: Lemma 12 in Gao2023WassersteinCG
  • Lemma 9: Proposition 10 in Gao2023WassersteinCG
  • Lemma 10
  • ...and 11 more