Harmonic Path Integral Diffusion

Hamidreza Behjoo, Michael Chertkov

TL;DR

Harmonic Path Integral Diffusion (H-PID) provides an analytically tractable, neural-network-free framework for sampling from complex multivariate distributions by constructing a time bridge from a delta function at the origin to a target distribution over $t\in[0,1]$. The method yields an explicit optimal control ${\bm u}^{(*)}(t)= a(t){\bm x}(t)-b(t)\hat{\bm x}(t;{\bm x}(t))$ via a Hopf-Cole transformation that links to a Schrödinger-type equation with a harmonic potential, and reduces to Gaussian Green functions in the integrable cases $V=0$ and quadratic $V$. The authors develop two algorithmic streams, Universal Importance Sampling (UIS), which is agnostic to the energy function, and strategies based on ground-truth (GT) samples, that enable efficient i.i.d. sampling and partition-function estimation, demonstrated on Gaussian mixtures and CIFAR-10. They further extend the theory to forced diffusion with gauge potentials and non-conservative forces, and analyze dynamic phase-transition-like behavior using the current weighted state as an order parameter. The work offers a transparent, mathematically grounded alternative to neural diffusion models, with potential for extensions to broader physics-inspired stochastic control problems and hybrid NN/UIS frameworks.

Abstract

In this manuscript, we present a novel approach for sampling from a continuous multivariate probability distribution, which may either be explicitly known (up to a normalization factor) or represented via empirical samples. Our method constructs a time-dependent bridge from a delta function centered at the origin of the state space at $t=0$, optimally transforming it into the target distribution at $t=1$. We formulate this as a Stochastic Optimal Control problem of the Path Integral Control type, with a cost function comprising (in its basic form) a quadratic control term, a quadratic state term, and a terminal constraint. This framework, which we refer to as Harmonic Path Integral Diffusion (H-PID), leverages an analytical solution through a mapping to an auxiliary quantum harmonic oscillator in imaginary time. The H-PID framework results in a set of efficient sampling algorithms, without the incorporation of Neural Networks. The algorithms are validated on two standard use cases: a mixture of Gaussians over a grid and images from CIFAR-10. The transparency of the method allows us to analyze the algorithms in detail, particularly revealing that the current weighted state is an order parameter for the dynamic phase transition, signaling earlier, at $t<1$, that the sample generation process is almost complete. We contrast these algorithms with other sampling methods, particularly simulated annealing and path integral sampling, highlighting their advantages in terms of analytical control, accuracy, and computational efficiency on benchmark problems. Additionally, we extend the methodology to more general cases where the underlying stochastic differential equation includes an external deterministic, possibly non-conservative force, and where the cost function incorporates a gauge potential term.
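
The bridge described above can be illustrated in its simplest setting. The sketch below is not the paper's algorithm: it assumes the case without a quadratic state cost ($V=0$) and uses the standard Brownian-bridge (Föllmer-type) drift $(\hat{\bm x}-{\bm x})/(1-t)$, where the weighted state $\hat{\bm x}$ is a self-normalized average over target samples with weights proportional to $\exp\left({\bm x}^T{\bm y}/(1-t) - t|{\bm y}|^2/(2(1-t))\right)$. That closed form is a textbook result which we assume matches the paper's $V=0$ limit. The function names, the grid centered at the origin, the number of target samples, the number of paths, and the seed are all hypothetical choices; the grid spacing of 5, the variance of 0.5, and the 200 time steps follow the Figure 1 description.

```python
import numpy as np


def make_grid_mixture_samples(rng, n, spacing=5.0, var=0.5):
    """Draw n samples from a 3x3 grid of isotropic Gaussians.

    Assumption: the grid is centered at the origin.
    """
    centers = spacing * np.array(
        [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)], dtype=float
    )
    idx = rng.integers(0, len(centers), size=n)
    samples = centers[idx] + np.sqrt(var) * rng.standard_normal((n, 2))
    return samples, centers


def weighted_state(x, t, y):
    """Self-normalized weighted average of target samples y, shape (N, d),
    evaluated at current states x, shape (M, d), for the assumed V = 0 case."""
    s = 1.0 - t
    logits = (x @ y.T) / s - t * np.sum(y**2, axis=1)[None, :] / (2.0 * s)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    return w @ y


def sample_bridge(y, n_paths, n_steps, rng):
    """Euler-Maruyama integration of dx = (x_hat - x)/(1 - t) dt + dW from x(0) = 0."""
    dt = 1.0 / n_steps
    x = np.zeros((n_paths, y.shape[1]))
    for k in range(n_steps):
        t = k * dt  # t stays below 1, so 1 - t >= dt > 0
        drift = (weighted_state(x, t, y) - x) / (1.0 - t)
        x = x + drift * dt + np.sqrt(dt) * rng.standard_normal(x.shape)
    return x


rng = np.random.default_rng(0)
y, centers = make_grid_mixture_samples(rng, n=2000)
x1 = sample_bridge(y, n_paths=200, n_steps=200, rng=rng)
```

Because the target here is represented only by its empirical samples, the terminal points `x1` land close to individual target samples; at $t=0$ the weights are uniform, so the weighted state starts at the sample mean.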

Paper Structure (28 sections, 49 equations, 11 figures, 1 algorithm)

Figures (11)

  • Figure 1: The red circles represent i.i.d. samples drawn from the target distribution, a mixture of nine Gaussian components arranged in a $3 \times 3$ square grid, where the distance between nearest grid points is $5$ and the variance of each Gaussian is $0.5$. The blue circles illustrate the temporal evolution of samples governed by Eq. (\ref{eq:SODE}), where the time interval $t \in [0, 1]$ is discretized into 200 steps and the optimal control ${\bm u}(t; {\bm x}) \to {\bm u}^*(t; {\bm x})$ is defined according to Eq. (\ref{eq:UHIS}) with $N=10000$. Each row corresponds to a different value of $\beta$ ($\beta = 0$, $\beta = 0.1$, $\beta = 1$, $\beta = 10$, and $\beta = 100$, respectively).
  • Figure 2: Estimation of the partition function $Z$ as a function of (a) the number of time-discretization steps and (b) the number of samples. The partition function $Z$ is calculated according to Algorithm \ref{alg:Z} in a setup similar to Fig. \ref{fig:cl2}, where the target distribution is a Gaussian mixture. The distance between adjacent centers in the $3 \times 3$ Gaussian mixture grid is 5, with $\beta = 0.5$. Both subfigures present the mean and variance as boxplots, based on 10 independent experiments.
  • Figure 3: Trajectories of 10 samples generated according to the optimal stochastic dynamics for the target distribution corresponding to the main use case of the $3 \times 3$ grid. The samples start at the origin and evolve over time in (a, left) the weighted state space $\hat{\bm x}(t; {\bm x}(t))$ and (b, right) the state space $\bm{x}(t)$. Notice that exploration in the weighted state space (left) is more extensive, with a greater variety of states visited during the process. Here $\beta=0.1$ and the number of discretization steps is $200$.
  • Figure 4: Sample evolution from $0 \to 1$: (left) the current weighted state, $\hat{\bm x}(t;{\bm x}(t))\vcentcolon= \sum_s\bm{y}^{(s)} w(\bm{y}^{(s)}|t;\bm{x}(t))$, and (right) the current state, $\bm{x}(t)$.
  • Figure 5: Autocorrelations for a single sample: (a, left) $\hat{\bm x}^T(t;{\bm x}(t)){\bm x}(1)$ and (b, right) $\bm{x}^T(t){\bm x}(1)$.
  • ...and 6 more figures