Transport meets Variational Inference: Controlled Monte Carlo Diffusions
Francisco Vargas, Shreyas Padhy, Denis Blessing, Nikolas Nüsken
TL;DR
This work bridges optimal transport and variational inference by casting sampling as the minimisation of a path-space divergence between forward and backward diffusion measures, $D(\overrightarrow{\mathbb{P}}^{\mu,a}\,\|\,\overleftarrow{\mathbb{P}}^{\nu,b})$, and introduces the Controlled Monte Carlo Diffusion (CMCD) sampler for Bayesian computation. CMCD fixes a path of target densities $\pi_t$ and learns a time-dependent control $\nabla\phi_t$ for the forward diffusion ${\mathrm d}{\bm Y}_t=(\sigma^2\nabla\ln\pi_t({\bm Y}_t)+\nabla\phi_t({\bm Y}_t))\,{\mathrm d}t+\sigma\sqrt{2}\,\overrightarrow{\mathrm d}{\bm W}_t$, so that the process interpolates from $\pi_0$ to $\pi_T$ while yielding unbiased estimates of the normalising constant through a controlled Crooks identity. The authors relate the EM algorithm to iterative proportional fitting (IPF) for Schrödinger bridges, prove that the CMCD objective yields a unique optimal drift, and link the dynamic Schrödinger problem to entropy-regularised transport. Empirically, CMCD achieves state-of-the-art performance on sampling and normalising-constant estimation across multiple benchmarks. The framework trains the diffusion end to end and leaves room for adaptive annealing strategies and novel divergences.
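To make the role of the control term concrete, the following is a minimal sketch of an Euler-Maruyama discretisation of the forward diffusion above. It is not the authors' implementation. It assumes a one-dimensional toy problem in which $\pi_0=\mathcal{N}(0,1)$, $\pi_T=\mathcal{N}(3,1)$, and $\pi_t$ is the geometric interpolation between them. All function names (`grad_log_pi`, `zero_control`, `exact_control`, `simulate`) are hypothetical. For this toy path the control that transports $\pi_t$ exactly is available in closed form (a constant shift equal to the velocity of the mean); in CMCD the control $\nabla\phi_t$ would instead be learned.

```python
import numpy as np

MU0, MU1 = 0.0, 3.0  # means of the toy endpoints pi_0 and pi_T (assumed)


def grad_log_pi(y, t, T):
    """Score of the geometric path pi_t ∝ pi_0^(1-b) * pi_T^b with b = t / T.

    For unit-variance Gaussian endpoints, pi_t = N((1-b)*MU0 + b*MU1, 1).
    """
    b = t / T
    return -(1.0 - b) * (y - MU0) - b * (y - MU1)


def zero_control(y, t, T):
    """No control: plain annealed Langevin dynamics."""
    return np.zeros_like(y)


def exact_control(y, t, T):
    """Closed-form control for this toy path: the velocity of the mean.

    This exists only because the path is a rigid translation of a Gaussian;
    in general the control has to be learned.
    """
    return np.full_like(y, (MU1 - MU0) / T)


def simulate(control, n_particles=2000, n_steps=200, T=1.0, sigma=1.0, seed=0):
    """Euler-Maruyama for dY = (sigma^2 grad log pi_t + control) dt + sigma sqrt(2) dW."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    y = rng.normal(MU0, 1.0, size=n_particles)  # start from pi_0
    for k in range(n_steps):
        t = k * dt
        drift = sigma**2 * grad_log_pi(y, t, T) + control(y, t, T)
        noise = rng.normal(size=n_particles)
        y = y + drift * dt + sigma * np.sqrt(2.0 * dt) * noise
    return y


controlled = simulate(exact_control)
uncontrolled = simulate(zero_control)
print("controlled mean/std:  ", controlled.mean(), controlled.std())
print("uncontrolled mean/std:", uncontrolled.mean(), uncontrolled.std())
```

In this sketch the uncontrolled particles lag behind the moving target over the finite horizon, whereas the controlled particles track $\pi_t$ and end close to $\pi_T$. That gap is what the learned control is meant to close in the general case.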
Abstract
Connecting optimal transport and variational inference, we present a principled and systematic framework for sampling and generative modelling centred around divergences on path space. Our work culminates in the development of the Controlled Monte Carlo Diffusion sampler (CMCD) for Bayesian computation, a score-based annealing technique that crucially adapts both forward and backward dynamics in a diffusion model. On the way, we clarify the relationship between the EM algorithm and iterative proportional fitting (IPF) for Schrödinger bridges, and derive a regularised objective that bypasses the iterative bottleneck of standard IPF updates. Finally, we show that CMCD has a strong foundation in the Jarzynski and Crooks identities from statistical physics, and that it convincingly outperforms competing approaches across a wide array of experiments.
