Convergence of the Inexact Langevin Algorithm and Score-based Generative Models in KL Divergence

Kaylee Yingxi Yang, Andre Wibisono

TL;DR

The paper analyzes sampling with inexact score functions for ILD, ILA, and SGM under log-Sobolev inequalities, introducing a bounded MGF error assumption that sits between the $L^{\infty}$ and $L^2$ error assumptions. It proves stable biased convergence in KL divergence for ILD/ILA and derives a stable KL guarantee for SGM when the score estimator satisfies the MGF condition, with a KDE-based estimator shown to meet this condition for sub-Gaussian targets. A key result is that the asymptotic bias is controlled by the score error and the LSI constant, yielding time-uniform bounds that do not diverge as the number of iterations grows. The KDE analysis also provides a concrete path to implementable score estimators with theoretical guarantees, contributing to a more robust theoretical foundation for diffusion-based sampling methods in non-strongly-convex settings.
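To make the ILA setting concrete, here is a minimal sketch of a Langevin iteration driven by an inexact score, $x_{k+1} = x_k + h\, s(x_k) + \sqrt{2h}\, z_k$ with $z_k \sim \mathcal{N}(0, I)$. The standard Gaussian target, the constant-bias estimator `make_inexact_score`, and all parameter values are illustrative assumptions, not taken from the paper. With this toy estimator the iterates settle near a slightly shifted Gaussian, which illustrates a bias that stays bounded rather than growing with the number of steps.

```python
import numpy as np

def inexact_langevin(score, x0, step, n_steps, rng):
    """Iterate x_{k+1} = x_k + step * score(x_k) + sqrt(2 * step) * z_k
    on a batch of particles of shape (n_particles, dim)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + step * score(x) + np.sqrt(2.0 * step) * noise
    return x

def exact_score(x):
    # Score of the (assumed) standard Gaussian target N(0, I): grad log nu(x) = -x.
    return -x

def make_inexact_score(eps):
    # Hypothetical estimator: exact score plus a constant error of size eps
    # in the first coordinate. Its error is bounded, so it is a very benign case.
    def s(x):
        bias = np.zeros_like(x)
        bias[..., 0] = eps
        return exact_score(x) + bias
    return s

rng = np.random.default_rng(0)
x0 = 3.0 * rng.standard_normal((5000, 2))   # deliberately over-dispersed start
samples = inexact_langevin(make_inexact_score(0.1), x0,
                           step=0.01, n_steps=2000, rng=rng)

# The empirical mean drifts towards (eps, 0) and the variance stays close to 1.
print("mean:", samples.mean(axis=0))
print("var: ", samples.var(axis=0))
```

Running longer does not change the picture in this toy case: the bias is set by the score error and the step size, not by the iteration count.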

Abstract

We study the Inexact Langevin Dynamics (ILD), Inexact Langevin Algorithm (ILA), and Score-based Generative Modeling (SGM) when utilizing estimated score functions for sampling. Our focus lies in establishing stable biased convergence guarantees in terms of the Kullback-Leibler (KL) divergence. To achieve these guarantees, we impose two key assumptions: 1) the target distribution satisfies the log-Sobolev inequality (LSI), and 2) the score estimator exhibits a bounded Moment Generating Function (MGF) error. Notably, the MGF error assumption we adopt is more lenient than the $L^\infty$ error assumption used in existing literature. However, it is stronger than the $L^2$ error assumption utilized in recent works, which often leads to unstable bounds. We explore the question of how to obtain a provably accurate score estimator that satisfies the MGF error assumption. Specifically, we demonstrate that a simple estimator based on kernel density estimation fulfills the MGF error assumption for a sub-Gaussian target distribution, at the population level.
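As a rough illustration of a kernel-density-based score estimator, the sketch below computes the gradient of the log of a Gaussian KDE. This is the generic Gaussian-kernel construction, not necessarily the exact estimator analyzed in the paper; the function name `kde_score`, the bandwidth, and the sample size are assumptions made for the example. For a Gaussian kernel with bandwidth $\sigma$, the KDE score at $x$ is a softmax-weighted average of $(X_i - x)/\sigma^2$ over the data points $X_i$.

```python
import numpy as np

def kde_score(x, data, bandwidth):
    """Gradient of the log of a Gaussian KDE built on `data`.

    x:    query points, shape (m, d)
    data: samples from the target, shape (n, d)
    Returns an array of shape (m, d).
    """
    diff = data[None, :, :] - x[:, None, :]                   # (m, n, d): X_i - x
    logw = -np.sum(diff ** 2, axis=-1) / (2.0 * bandwidth ** 2)
    logw -= logw.max(axis=1, keepdims=True)                   # stabilise the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)                         # kernel weights per query
    return np.einsum("mn,mnd->md", w, diff) / bandwidth ** 2

rng = np.random.default_rng(0)
data = rng.standard_normal((20000, 1))        # assumed target: N(0, 1)
query = np.array([[0.0], [1.0], [-1.0]])
est = kde_score(query, data, bandwidth=0.5)

# The KDE smooths N(0, 1) into roughly N(0, 1 + 0.5**2), whose score is -x / 1.25.
print(est.ravel())
print((-query / 1.25).ravel())
```

The gap between the KDE score and the true score $-x$ comes from the smoothing bandwidth and the finite sample, which is the kind of estimation error the MGF assumption is meant to control.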

Paper Structure (41 sections, 23 theorems, 140 equations, 2 tables)

Key Result

Theorem 1

Assume $\nu$ satisfies $\alpha$-LSI and the score estimator $s$ has bounded MGF error (the bounded MGF error assumption) with $r = \frac{1}{\alpha}$. Then for $X_t \sim \rho_t$ along the ILD with score estimator $s$, we have

Theorems & Definitions (49)

  • Definition 1: KL divergence
  • Definition 2: Relative Fisher information
  • Definition 3: Rényi divergence
  • Theorem 1: Convergence of KL divergence for ILD
  • Theorem 2: Convergence of KL divergence for ILA
  • Theorem 3: Convergence of Rényi divergence for ILA under $L^\infty$ error
  • Example 1: Comparison of different score error assumptions in a simple Gaussian case
  • Theorem 4
  • Corollary 1
  • Proof: Proof sketch of Theorem \ref{thm:conv-ddpm}
  • ...and 39 more
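
For orientation, the quantities named in Definitions 1 to 3 are commonly written as follows for a distribution $\rho$ relative to a target $\nu$. These are the standard textbook forms; the paper's own notation and normalization constants are not shown in this listing and may differ.

$$H_\nu(\rho) = \int \rho(x) \log \frac{\rho(x)}{\nu(x)} \, dx \qquad \text{(KL divergence)}$$

$$J_\nu(\rho) = \int \rho(x) \left\| \nabla \log \frac{\rho(x)}{\nu(x)} \right\|^2 dx \qquad \text{(relative Fisher information)}$$

$$R_q(\rho \,\|\, \nu) = \frac{1}{q-1} \log \int \nu(x) \left( \frac{\rho(x)}{\nu(x)} \right)^q dx, \quad q > 0,\ q \neq 1 \qquad \text{(Rényi divergence of order } q\text{)}$$

Under one common convention, $\nu$ satisfies the log-Sobolev inequality with constant $\alpha > 0$ if $H_\nu(\rho) \le \frac{1}{2\alpha} J_\nu(\rho)$ for all suitable $\rho$, and $R_q$ recovers $H_\nu$ in the limit $q \to 1$.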