MGD: Moment Guided Diffusion for Maximum Entropy Generation

Etienne Lempereur; Nathanaël Cuvelle-Magar; Florentin Coeurdoux; Stéphane Mallat; Eric Vanden-Eijnden

TL;DR

Moment Guided Diffusion (MGD) combines maximum entropy reasoning with diffusion-based sampling to sample high-dimensional distributions subject to moment constraints. It transports noise toward data along a finite-time, non-equilibrium path while preserving prescribed moments through an SDE with volatility $\sigma$. As $\sigma$ grows, MGD formally converges to the maximum entropy distribution $p_*$, and it provides a computable entropy lower bound whose gap to $H(p_*)$ decays as $O(\sigma^{-2})$. The framework is validated on multiscale processes (financial time series, turbulence, and cosmological fields) using wavelet scattering moments, which yields negentropy estimates and a principled quantification of non-Gaussianity. Because it relies on non-ergodic transport with explicit moment control and entropy estimation, MGD offers a scalable alternative to MCMC for high-dimensional maximum entropy modelling, with applications in physics, finance, and beyond.

Abstract

Generating samples from limited information is a fundamental problem across scientific domains. Classical maximum entropy methods provide principled uncertainty quantification from moment constraints but require sampling via MCMC or Langevin dynamics, which typically exhibit exponential slowdown in high dimensions. In contrast, generative models based on diffusion and flow matching efficiently transport noise to data but offer limited theoretical guarantees and can overfit when data is scarce. We introduce Moment Guided Diffusion (MGD), which combines elements of both approaches. Building on the stochastic interpolant framework, MGD samples maximum entropy distributions by solving a stochastic differential equation that guides moments toward prescribed values in finite time, thereby avoiding slow mixing in equilibrium-based methods. We formally obtain, in the large-volatility limit, convergence of MGD to the maximum entropy distribution and derive a tractable estimator of the resulting entropy computed directly from the dynamics. Applications to financial time series, turbulent flows, and cosmological fields using wavelet scattering moments yield estimates of negentropy for high-dimensional multiscale processes.
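For reference, the classical maximum entropy problem that the abstract builds on can be written as below. This is the standard textbook formulation, not a quotation from the paper: $\phi$, $m$ and $p_*$ follow the notation used elsewhere on this page, while the Lagrange multiplier $\lambda$ and the normalizer $\mathcal{Z}_\lambda$ are notation assumed here for illustration.

```latex
\begin{aligned}
p_* &= \arg\max_{p}\; H(p)
\quad \text{subject to} \quad \mathbb{E}_{p}[\phi(X)] = m,
\qquad H(p) = -\int p(x)\,\log p(x)\,dx, \\
p_*(x) &= \mathcal{Z}_\lambda^{-1}\, e^{-\lambda^\top \phi(x)},
\qquad \mathcal{Z}_\lambda = \int e^{-\lambda^\top \phi(x)}\,dx .
\end{aligned}
```

When a multiplier $\lambda$ matching the constraint exists, the maximizer has this exponential-family form. For instance, with $\phi(x) = (x, x^2, x^3, x^4)$ the bimodal density $p_*(x) = \mathcal{Z}^{-1} e^{-\tfrac{4}{5}(x^4 - 5x^2 - x/2)}$ of Figure 2 corresponds to $\lambda = \tfrac{4}{5}\,(-\tfrac{1}{2}, -5, 0, 1)$.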

Paper Structure (67 sections, 26 theorems, 183 equations, 12 figures, 2 tables, 3 algorithms)

Introduction
Background: Classical Maximum Entropy and Modern Generative Modeling
Maximum Entropy Estimation via Langevin Dynamics
Flow Matching with Stochastic Interpolants
Moment Guided Diffusion
Moment Guided Diffusion
Discretization of MGD
Maximum Entropy: Convergence and Bounds
Convergence towards the Maximum Entropy Distribution
Entropy Estimation
Numerical Validation
Convergence towards Maximum Entropy Distributions
Non-log-concave Density
Slower Convergence
Non-smooth $\phi$
...and 52 more sections

Key Result

Theorem 3.1

Consider the SDE where $W_t$ is a Brownian noise, and $\eta_t$ and $\theta_t$ solve where $G_t$ is the Gram matrix. If this coupled system admits a solution, then the moment condition $\mathbb{E}[\phi(X_t)] = m_t$ holds for all $t \in [0,1]$.
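The moment condition above needs a target path $m_t$. The sketch below shows one plausible way to estimate such a path by Monte Carlo, under assumptions that are not confirmed by the text on this page: that $m_t = \mathbb{E}[\phi(I_t)]$, and that the interpolant is the trigonometric one $I_t = \cos(\alpha_t) Z + \sin(\alpha_t) X$ with $\alpha_t = \pi t/2$ (only $\alpha_t$ and the endpoints $Z$, $X$ appear in the Figure 1 caption). The function names and the toy Gaussian-mixture data are hypothetical.

```python
import numpy as np


def phi(x):
    """Polynomial moment function phi(x) = (x, x^2, x^3, x^4), as in Figure 2."""
    return np.stack([x, x**2, x**3, x**4], axis=-1)


def moment_path(x_data, times, rng):
    """Monte Carlo estimate of m_t = E[phi(I_t)] on a grid of times.

    ASSUMPTION (not taken from the source): the interpolant is
    I_t = cos(alpha_t) * Z + sin(alpha_t) * X with alpha_t = pi * t / 2,
    where Z is standard Gaussian noise drawn independently of the data X.
    """
    z = rng.standard_normal(x_data.shape)
    path = []
    for t in times:
        alpha = 0.5 * np.pi * t
        i_t = np.cos(alpha) * z + np.sin(alpha) * x_data
        path.append(phi(i_t).mean(axis=0))
    return np.array(path)


rng = np.random.default_rng(0)

# Hypothetical data: an unbalanced two-component Gaussian mixture.
n = 200_000
upper = rng.random(n) < 0.7
x_data = np.where(upper, rng.normal(1.5, 0.3, n), rng.normal(-1.5, 0.3, n))

times = np.linspace(0.0, 1.0, 11)
m = moment_path(x_data, times, rng)  # shape (len(times), 4)
```

Under these assumptions the path starts at the Gaussian moments $(0, 1, 0, 3)$ at $t = 0$ and ends at the empirical data moments at $t = 1$, which gives a simple sanity check on any sampler that claims to satisfy $\mathbb{E}[\phi(X_t)] = m_t$.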

Figures (12)

Figure 1: Illustration of trajectories (in blue or red) of $X_t$ satisfying Equation (\ref{eq:stochinterpolant_sde}) for an interpolant $I_t$ defined with $\alpha_t = \pi t/2$ between white noise $Z$ and a bimodal unbalanced Gaussian mixture $X$, for $\sigma=1$. We display in gray in the background the density of $I_t$. When $t$ goes to $0$, the modes progressively disappear. At early times $t$, particles evolve freely in space, but they become trapped in the modes when the density $p_t$ becomes bimodal. Red particles are confined in the upper mode and blue particles in the lower one.
Figure 2: Convergence of MGD towards the maximum entropy bimodal distribution $p_*(x) = \mathcal{Z}^{-1} e^{-\tfrac{4}{5}(x^4 - 5x^2 - x/2)}$ for $X \sim p = p_*$. Left column: moment function $\phi(x) = (x, x^2, x^3, x^4)$. Right column: $\phi(x) = (x^2, \log p(x))$. (a,d) Log-density $\log p_*(x)$ (dashed) and $\log p_1^\sigma(x)$ for increasing $\sigma$ (blue to red). (b,e) Maximum entropy $H(p_*)$ (red line), sampled entropy $H(p_1^\sigma)$ (blue dots), and lower bound $H_*^\sigma$ from Equation (\ref{eq:entropy}) (black dots) versus $\sigma^2$. (c,f) Entropy gaps $H(p_*) - H(p_1^\sigma)$ (blue) and $H(p_*) - H_*^\sigma$ (black) versus $\sigma^2$; the dashed line shows $\sigma^{-2}$ decay.
Figure 3: Convergence of MGD towards the Laplacian maximum entropy distribution $p_*(x) = \tfrac{1}{2}e^{-|x|}$ for $X \sim p = p_*$. (a) Log-density $\log p_*(x)$ (dashed) and $\log p_1^\sigma(x)$ for increasing $\sigma$ (blue to red). (b, top) Maximum entropy $H(p_*)$ (red line), sampled entropy $H(p_1^\sigma)$ (blue dots), and lower bound $H_*^\sigma$ from Equation (\ref{eq:entropy}) (black dots) versus $\sigma^2$. (b, bottom) Entropy gaps $H(p_*) - H(p_1^\sigma)$ (blue) and $H(p_*) - H_*^\sigma$ (black) versus $\sigma^2$; the dashed line shows $\sigma^{-2}$ decay.
Figure 4: (a) Log-density $\log p_*(x)$ for $p_*(x) =\mathcal{Z}_\beta^{-1} e^{-\beta(x^4 - 5x^2 -x/2)}$ with increasing $\beta$ (blue to red). The two modes are separated by a barrier of height proportional to $\beta$. (b) Number of discretization steps $n_{\rm steps}$ required to reach a fixed Kullback--Leibler divergence from $p_*$, for MALA (red) and MGD (green), as a function of $\beta$. For MALA, $n_{\rm steps}$ grows exponentially with $\beta$; for MGD, it remains nearly constant.
Figure 5: (a) One-dimensional Morlet wavelet $\psi$. The wavelet is a complex function whose real and imaginary parts are shown in blue and red, respectively. (b) Real (left) and imaginary (right) parts of a two-dimensional Morlet wavelet.
...and 7 more figures
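The reference entropies $H(p_*)$ plotted in Figures 2 and 3 can be reproduced for these one-dimensional targets by direct quadrature. The sketch below is an illustration written for this summary, not the paper's estimator: it evaluates the two densities from the captions on a grid, and it computes negentropy as the entropy of a Gaussian with the same variance minus $H(p_*)$, which is the usual definition and is assumed here. The helper names and grid sizes are hypothetical choices.

```python
import numpy as np


def normalize_log_density(log_q, dx):
    """Return log p = log q - log Z on a uniform grid (shifted for stability)."""
    shift = log_q.max()
    log_z = shift + np.log(np.sum(np.exp(log_q - shift)) * dx)
    return log_q - log_z


def grid_entropy(log_p, dx):
    """Differential entropy H(p) = -int p log p dx, by a Riemann sum."""
    p = np.exp(log_p)
    return -np.sum(p * log_p) * dx


# Bimodal target of Figure 2: p_*(x) proportional to exp(-(4/5)(x^4 - 5x^2 - x/2)).
x = np.linspace(-6.0, 6.0, 200_001)
dx = x[1] - x[0]
log_p = normalize_log_density(-0.8 * (x**4 - 5.0 * x**2 - 0.5 * x), dx)
p = np.exp(log_p)

mass = np.sum(p) * dx
entropy = grid_entropy(log_p, dx)
mean = np.sum(x * p) * dx
var = np.sum((x - mean) ** 2 * p) * dx

# Negentropy relative to the Gaussian with the same variance (assumed definition).
gauss_entropy = 0.5 * np.log(2.0 * np.pi * np.e * var)
negentropy = gauss_entropy - entropy

# Laplace target of Figure 3: p_*(x) = exp(-|x|) / 2, with H(p_*) = 1 + log 2.
y = np.linspace(-40.0, 40.0, 800_001)
dy = y[1] - y[0]
laplace_entropy = grid_entropy(normalize_log_density(-np.abs(y), dy), dy)

print(f"bimodal: H = {entropy:.4f}, negentropy = {negentropy:.4f}")
print(f"laplace: H = {laplace_entropy:.4f}")
```

The Laplace case has the closed form $H(p_*) = 1 + \log 2$, so it doubles as a check on the quadrature. The negentropy of the bimodal target is strictly positive because a Gaussian maximizes entropy at fixed variance.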

Theorems & Definitions (55)

Theorem 3.1: Moment Guided Diffusion
Remark 3.2: Sampling vs. modelling error
Remark 3.3
Conjecture 4.0: Max entropy
Remark 4.1
Proposition 4.1
Conjecture 4.1: Entropy bound
Theorem A.1: Moment Guided Diffusion
proof
Proposition A.0
...and 45 more
