Score-based deterministic density sampling

Vasily Ilin; Peter Sushko; Jingwei Hu

TL;DR

The paper tackles sampling from an unnormalized density $\pi$ when only the score $\nabla \log \pi$ is available, proposing Score-Based Transport Modeling (SBTM) as a deterministic counterpart to stochastic samplers such as Langevin dynamics. SBTM couples a particle system with a time-varying score network $s^{\Theta_t}$, trained by score matching to approximate $\nabla \log f_t$, where $f_t$ is the density of the particles at time $t$. Particles evolve according to $\dot X_t=\nabla \log \pi(X_t)-s^{\Theta_t}(X_t)$, while $\Theta_t$ is updated to minimize the score-matching loss $L(s^{\Theta_t}, f_t)$. The authors prove entropy-dissipation guarantees for the coupled dynamics, show that a small score-matching loss yields near-optimal convergence rates under a log-Sobolev condition, and extend the results to annealed dynamics. Empirically, SBTM exhibits smooth trajectories, optimal or near-optimal convergence rates, and strong sample efficiency across low-dimensional, multimodal, and high-dimensional targets, including MNIST in $784$ dimensions, with memory and runtime scaling linearly in the sample size. The approach provides a practical, deterministic alternative to Langevin dynamics with interpretable convergence diagnostics and effective high-dimensional performance.
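The link between the deterministic particle dynamics and Langevin dynamics can be made explicit with a short calculation. The following is a sketch for the idealized case in which the learned score is exact, $s^{\Theta_t}=\nabla \log f_t$, and assumes enough regularity and decay to integrate by parts; it is not the paper's theorem, which also covers an imperfectly trained score. Particles moving with velocity $v_t=\nabla \log \pi-\nabla \log f_t$ have a density that satisfies the continuity equation

$$
\partial_t f_t = -\nabla\cdot\big(f_t\,(\nabla \log \pi-\nabla \log f_t)\big) = -\nabla\cdot\big(f_t\,\nabla \log \pi\big)+\Delta f_t ,
$$

which is the Fokker-Planck equation of the Langevin diffusion $dX_t=\nabla \log \pi(X_t)\,dt+\sqrt{2}\,dW_t$, so the two processes share the same marginals. Differentiating the relative entropy along this flow gives

$$
\frac{d}{dt}\,\mathrm{KL}(f_t\|\pi)=\int f_t\, v_t\cdot\nabla \log\frac{f_t}{\pi}\,dx=-\int f_t\,\Big|\nabla \log\frac{f_t}{\pi}\Big|^2dx\le 0 ,
$$

so the relative entropy decreases at a rate given by the relative Fisher information.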

Abstract

We propose a deterministic sampling framework using Score-Based Transport Modeling for sampling an unnormalized target density $\pi$ given only its score $\nabla \log \pi$. Our method approximates the Wasserstein gradient flow on $\mathrm{KL}(f_t\|\pi)$ by learning the time-varying score $\nabla \log f_t$ on the fly using score matching. While having the same marginal distribution as Langevin dynamics, our method produces smooth deterministic trajectories, resulting in monotone noise-free convergence. We prove that our method dissipates relative entropy at the same rate as the exact gradient flow, provided sufficient training. Numerical experiments validate our theoretical findings: our method converges at the optimal rate, has smooth trajectories, and is often more sample efficient than its stochastic counterpart. Experiments on high-dimensional image data show that our method produces high-quality generations in as few as 15 steps and exhibits natural exploratory behavior. The memory and runtime scale linearly in the sample size.
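The alternation between fitting the particle score and moving the particles can be illustrated on a toy problem. The sketch below is not the authors' implementation. It makes several simplifying assumptions: the score network is swapped for an affine model $s(x)=Ax+b$, whose implicit score-matching minimizer is available in closed form (so no gradient-based training is needed); the target is an isotropic Gaussian with known score; and the ODE is integrated with forward Euler. All function names and parameter values are hypothetical.

```python
import numpy as np

def target_score(x, mu, sigma):
    """Score of the assumed target N(mu, sigma^2 I): grad log pi(x)."""
    return -(x - mu) / sigma**2

def fit_affine_score(x):
    """Fit s(x) = A x + b by minimizing the empirical implicit
    score-matching loss mean(|s(x)|^2 + 2 div s(x)).
    For an affine model the minimizer is A = -C^{-1}, b = C^{-1} m,
    where m and C are the particle mean and (biased) covariance."""
    m = x.mean(axis=0)
    C = np.cov(x, rowvar=False, bias=True)
    A = -np.linalg.inv(C)
    b = -A @ m
    return A, b

def sbtm_sketch(x0, mu, sigma, dt=0.05, n_steps=200):
    """Alternate between refitting the particle score and one Euler step
    of dX/dt = grad log pi(X) - s(X). No noise is injected."""
    x = x0.copy()
    for _ in range(n_steps):
        A, b = fit_affine_score(x)          # stand-in for network training
        s = x @ A.T + b                     # approximate grad log f_t at particles
        x = x + dt * (target_score(x, mu, sigma) - s)
    return x

rng = np.random.default_rng(0)
mu = np.array([3.0, -1.0])                  # assumed target mean
x0 = 2.0 * rng.standard_normal((500, 2))    # initial particles ~ N(0, 4 I)
x_final = sbtm_sketch(x0, mu, sigma=1.0)

print("final mean:", x_final.mean(axis=0))
print("final covariance:\n", np.cov(x_final, rowvar=False, bias=True))
```

Because the affine model is exact only for Gaussian particle densities, this sketch is limited to Gaussian-to-Gaussian transport; a multimodal target would need a more expressive score model trained by gradient descent, as in the method described above.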
