Wasserstein Convergence of Critically Damped Langevin Diffusions

Stanislas Strasman, Sobihan Surendran, Claire Boyer, Sylvain Le Corff, Vincent Lemaire, Antonio Ocello

TL;DR

This work provides the first Wasserstein-2 convergence analysis for Critically-Damped Langevin Diffusions (CLDs) in score-based diffusion settings, under weaker regularity assumptions than prior KL-based results. It introduces a generalized dynamic with a position-noise hyperparameter $\varepsilon$ that restores ellipticity on the extended phase space and makes a Wasserstein contraction analysis tractable. The authors establish an explicit $\mathcal{W}_2$ bound that decomposes into an exponentially decaying term, a discretization error of order $\sqrt{h}$, and a score-approximation term, thereby tying sampling accuracy to both the discretization and the quality of the learned score. Empirically, tuning $\varepsilon$ improves sampling quality on challenging synthetic tasks, which supports the theory and offers practical guidance for CLD-based SGMs.

Abstract

Score-based Generative Models (SGMs) have achieved impressive performance in data generation across a wide range of applications and benefit from strong theoretical guarantees. Recently, methods inspired by statistical mechanics, in particular, Hamiltonian dynamics, have introduced Critically-damped Langevin Diffusions (CLDs), which define diffusion processes on extended spaces by coupling the data with auxiliary variables. These approaches, along with their associated score-matching and sampling procedures, have been shown to outperform standard diffusion-based samplers numerically. In this paper, we analyze a generalized dynamic that extends classical CLDs by introducing an additional hyperparameter controlling the noise applied to the data coordinate, thereby better exploiting the extended space. We further derive a novel upper bound on the sampling error of CLD-based generative models in the Wasserstein metric. This additional hyperparameter influences the smoothness of sample paths, and our discretization error analysis provides practical guidance for its tuning, leading to improved sampling performance.
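To make the role of the position-noise hyperparameter concrete, here is a minimal sketch of an Euler-Maruyama simulation of a critically damped system with noise on both coordinates. The exact drift and diffusion matrices of the paper's generalized dynamic are not reproduced in this summary, so the system below (unit mass, friction 2, noise levels `eps` on the position and `sigma` on the velocity) is an assumed, simplified form, and the function name `simulate_cld` is hypothetical.

```python
import numpy as np


def simulate_cld(x0, v0, eps, sigma, h, n_steps, rng):
    """Euler-Maruyama for an illustrative (assumed) critically damped system:

        dX = V dt + eps * dW1
        dV = -(X + 2 V) dt + sigma * dW2

    eps = 0 corresponds to noise on the auxiliary (velocity) coordinate only;
    eps > 0 also injects noise into the data (position) coordinate.
    """
    x = np.array(x0, dtype=float)
    v = np.array(v0, dtype=float)
    sqrt_h = np.sqrt(h)
    for _ in range(n_steps):
        # Independent Brownian increments with variance h.
        dw1 = rng.standard_normal(x.shape) * sqrt_h
        dw2 = rng.standard_normal(v.shape) * sqrt_h
        # Both updates use the state at the start of the step.
        x_new = x + h * v + eps * dw1
        v_new = v - h * (x + 2.0 * v) + sigma * dw2
        x, v = x_new, v_new
    return x, v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.standard_normal(4)
    v0 = np.zeros(4)
    for eps in (0.0, 0.5):
        x, v = simulate_cld(x0, v0, eps=eps, sigma=2.0, h=1e-2, n_steps=500, rng=rng)
        print(f"eps={eps}: x={x}, v={v}")
```

With `eps = 0` and `sigma = 0` the scheme reduces to explicit Euler for the critically damped oscillator $x'' + 2x' + x = 0$, whose solution from $(x, v) = (1, 0)$ is $(1 + t)e^{-t}$, which gives a simple sanity check on the drift.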

Paper Structure

This paper contains 42 sections, 23 theorems, 288 equations, 2 figures, 3 tables, 2 algorithms.

Key Result

Theorem 3.1

Assume that Assumptions hyp:fisher_info to hyp:sup_approx hold. Then, there exist $c_1, c_2>0$ such that, for all $h>0$,

Figures (2)

  • Figure 1: Mean $\mathcal{W}_2$ distance over 5 repetitions between the test set and generated samples on Funnel distribution in dimension 100. Error bars represent $\pm$ one standard deviation.
  • Figure 3: Neural network architecture.

Theorems & Definitions (48)

  • Theorem 3.1
  • proof
  • Theorem 3.2
  • proof
  • Remark 3.3
  • Lemma A.1
  • proof
  • Lemma A.2
  • proof
  • Lemma A.3
  • ...and 38 more