Malliavin Calculus for Score-based Diffusion Models
Ehsan Mirafzali, Utkarsh Gupta, Patrick Wyrod, Frank Proske, Daniele Venturi, Razvan Marinescu
TL;DR
This work develops a rigorous Malliavin-calculus framework for deriving exact analytical expressions for the score function $\nabla_y \log p_t(y)$ of the forward SDEs underlying score-based diffusion models. Using the Malliavin matrix, the first and second variation processes, and a Bismut-type formula, it yields closed-form score representations for linear SDEs that agree with the score obtained from the Fokker–Planck solution, and it extends to nonlinear SDEs with state-independent diffusion coefficients through a Skorokhod-integral formulation. The authors provide algorithmic pipelines for both the linear and nonlinear cases, comprising forward simulation of the variation processes, neural estimation of conditional expectations, and reverse-time sampling guided by the Malliavin-derived score. Numerical results on synthetic datasets show performance competitive with state-of-the-art methods and offer insight into stability and discretisation in nonlinear settings, highlighting the framework's potential to generalise score-based diffusion models beyond Gaussian forward processes. The work grounds score-based modelling in stochastic analysis, opening pathways to more expressive diffusion models and to new estimation strategies for Malliavin derivatives in high dimensions.
Abstract
We introduce a new framework based on Malliavin calculus to derive exact analytical expressions for the score function $\nabla \log p_t(x)$, i.e., the gradient of the log-density associated with the solution to stochastic differential equations (SDEs). Our approach combines classical integration-by-parts techniques with modern stochastic analysis tools, such as Bismut's formula and Malliavin calculus, and it works for both linear and nonlinear SDEs. In doing so, we establish a rigorous connection between the Malliavin derivative, its adjoint, the Malliavin divergence (Skorokhod integral), and diffusion generative models, thereby providing a systematic method for computing $\nabla \log p_t(x)$. In the linear case, we present a detailed analysis showing that our formula coincides with the analytical score function derived from the solution of the Fokker–Planck equation. For nonlinear SDEs with state-independent diffusion coefficients, we derive a closed-form expression for $\nabla \log p_t(x)$. We evaluate the proposed framework across multiple generative tasks and find that its performance is comparable to state-of-the-art methods. These results can be generalised to broader classes of SDEs, paving the way for new score-based diffusion generative models.
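The linear-case claim can be illustrated with a minimal sketch, which is not the paper's pipeline. It assumes a one-dimensional Ornstein–Uhlenbeck forward SDE $dX_t = -\theta X_t\,dt + \sigma\,dW_t$ started from a fixed point $x_0$. Under that assumption the first variation process is $Y_t = e^{-\theta t}$, the Malliavin derivative is $D_s X_T = Y_T Y_s^{-1}\sigma$, and the Malliavin matrix $\gamma_T = \int_0^T (D_s X_T)^2\,ds$ should equal the variance of the Gaussian transition density that solves the Fokker–Planck equation. All function names and parameter values below are hypothetical, chosen only for illustration.

```python
import numpy as np


def ou_malliavin_matrix(theta, sigma, T, n_steps=20_000):
    """Approximate gamma_T = int_0^T (D_s X_T)^2 ds with the midpoint rule.

    Assumption: 1D OU process, so D_s X_T = sigma * exp(-theta * (T - s)).
    """
    ds = T / n_steps
    s = (np.arange(n_steps) + 0.5) * ds           # midpoints of the time grid
    d_s_x_T = sigma * np.exp(-theta * (T - s))    # Malliavin derivative D_s X_T
    return np.sum(d_s_x_T ** 2) * ds


def ou_score_malliavin(y, x0, theta, sigma, T):
    """Score of p_T(y | x0) using the numerically integrated Malliavin matrix."""
    gamma = ou_malliavin_matrix(theta, sigma, T)
    mean = x0 * np.exp(-theta * T)
    return -(y - mean) / gamma


def ou_score_fokker_planck(y, x0, theta, sigma, T):
    """Score of the Gaussian transition density solving the Fokker-Planck equation."""
    var = sigma ** 2 * (1.0 - np.exp(-2.0 * theta * T)) / (2.0 * theta)
    mean = x0 * np.exp(-theta * T)
    return -(y - mean) / var


if __name__ == "__main__":
    y = np.linspace(-2.0, 2.0, 9)
    malliavin = ou_score_malliavin(y, x0=1.0, theta=0.5, sigma=1.2, T=1.0)
    fokker_planck = ou_score_fokker_planck(y, x0=1.0, theta=0.5, sigma=1.2, T=1.0)
    # The two scores should agree up to quadrature error.
    print(np.max(np.abs(malliavin - fokker_planck)))
```

The sketch conditions on a single initial point, so it checks only the transition-density score. The score of the marginal $p_t$ under a data distribution additionally requires the conditional expectations mentioned above.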
