On learning higher-order cumulants in diffusion models

Gert Aarts, Diaa E. Habibi, Lingxiao Wang, Kai Zhou

TL;DR

This work analyzes how higher-order cumulants evolve in diffusion models, addressing whether non-Gaussian correlations are preserved or learned. By deriving explicit moment- and cumulant-generating functionals for both forward and backward processes, it shows that in driftless diffusion higher cumulants $\kappa_{n>2}$ are conserved, while in drifted schemes they decay toward Gaussianity; nonetheless the backward score regenerates the target higher-order structure. The authors verify these predictions in an exactly solvable toy model and in a lattice $\phi^4$ theory, demonstrating accurate learning of non-Gaussian cumulants across many degrees of freedom. The results suggest diffusion models can faithfully capture complex correlations in physics-inspired data, with implications for generating lattice configurations and for refining diffusion-based methods through informed noise scheduling and RG-inspired perspectives.

Abstract

To analyse how diffusion models learn correlations beyond Gaussian ones, we study the behaviour of higher-order cumulants, or connected n-point functions, under both the forward and backward processes. We derive explicit expressions for the moment- and cumulant-generating functionals, in terms of the distribution of the initial data and properties of the forward process. It is shown analytically that during the forward process higher-order cumulants are conserved in models without a drift, such as the variance-expanding scheme, and that therefore the endpoint of the forward process maintains nontrivial correlations. We demonstrate that since these correlations are encoded in the score function, higher-order cumulants are learnt in the backward process, also when starting from a normal prior. We confirm our analytical results in an exactly solvable toy model with nonzero cumulants and in scalar lattice field theory.
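
The conservation statement can be illustrated for a single degree of freedom. The following is a sketch in notation assumed here, not taken from the paper: $x_0$ is drawn from the data distribution, $\eta$ is an independent standard normal variable, $\sigma(t)$ is the accumulated noise width, $a(t)$ is a drift-induced rescaling, and $W_t(j)$ denotes the cumulant-generating function at time $t$. Without drift, $x(t) = x_0 + \sigma(t)\,\eta$, and since cumulant-generating functions of independent variables add,

$$
W_t(j) = \ln \left\langle e^{j x(t)} \right\rangle = W_0(j) + \tfrac{1}{2}\sigma^2(t)\, j^2
\quad\Longrightarrow\quad
\kappa_2(t) = \kappa_2(0) + \sigma^2(t), \qquad \kappa_{n>2}(t) = \kappa_n(0).
$$

With a drift that rescales the data, $x(t) = a(t)\,x_0 + \sigma(t)\,\eta$ with $0 < a(t) < 1$, the same argument gives

$$
W_t(j) = W_0\big(a(t)\, j\big) + \tfrac{1}{2}\sigma^2(t)\, j^2
\quad\Longrightarrow\quad
\kappa_{n>2}(t) = a(t)^n\, \kappa_n(0),
$$

so the higher-order cumulants shrink as $a(t)$ decreases. The Gaussian noise only ever contributes to $\kappa_2$, which is why only the second cumulant grows in the driftless case.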

Paper Structure

This paper contains 12 sections, 75 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Evolution of the normalised second moment or cumulant, presented as $\kappa_2/\kappa_2^{\rm exact}-1$, in the two-peak model in the variance-expanding scheme, with $\mu_0=1$ and $\sigma_0=1/4$, during the forward process (left), the backward process with the score determined by the diffusion model (middle), and with the analytical score (right), all using $10^6$ trajectories. The insets zoom in at $0.6<\tau<1$.
  • Figure 2: Evolution of the normalised $4^{\rm th}$ (left), $6^{\rm th}$ (middle) and $8^{\rm th}$ (right) moments, presented as $\mu_n/\mu_n^{\rm exact}-1$, in the two-peak model in the variance-expanding scheme, during the forward process using $10^5, 10^6$ and $10^7$ trajectories (above), and during the backward process with the score determined by the diffusion model, using $10^6$ trajectories (below). Other parameters as above.
  • Figure 3: Distribution created by sampling from the target distribution with $\mu_0=1$ and $\sigma_0=1/4$ (Data) and from the trained diffusion model in the variance-expanding scheme (Diffusion), using $10^6$ samples in each case.
  • Figure 4: Evolution of the normalised $4^{\rm th}, 6^{\rm th}$ and $8^{\rm th}$ cumulants, presented as $\kappa_n/\kappa_n^{\rm exact}-1$, in the two-peak model in the variance-expanding scheme, during the forward process, using $10^5, 10^6$ and $10^7$ trajectories (above), and during the backward process, with the score determined by the diffusion model, using $10^6$ trajectories (below). Other parameters as above.
  • Figure 5: As in the preceding figure, employing the analytical score during the backward process, using $10^6$ trajectories.
  • ...and 4 more figures
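
The forward-process behaviour shown in the figures can be reproduced qualitatively with a short numerical check. The sketch below is not the authors' code. It assumes the two-peak model is an equal-weight mixture of two Gaussians centred at $\pm\mu_0$ with width $\sigma_0$, for which $\kappa_4 = -2\mu_0^4$, and it applies a single forward step of the form $x = a\,x_0 + s\,\eta$. The function names, the noise width and the value of $a$ are hypothetical choices made for illustration.

```python
import numpy as np


def sample_two_peak(n, mu0=1.0, sigma0=0.25, rng=None):
    """Draw n samples from an equal-weight mixture of N(+mu0, sigma0^2)
    and N(-mu0, sigma0^2). This form of the two-peak model is an assumption."""
    rng = np.random.default_rng() if rng is None else rng
    signs = rng.choice([-1.0, 1.0], size=n)
    return signs * mu0 + sigma0 * rng.standard_normal(n)


def kappa4(x):
    """Sample estimate of the fourth cumulant: m4 - 3 * m2^2 (central moments)."""
    xc = x - x.mean()
    m2 = np.mean(xc**2)
    m4 = np.mean(xc**4)
    return m4 - 3.0 * m2**2


def forward(x0, scale, noise_std, rng):
    """One-shot forward process x = scale * x0 + noise_std * eta, eta ~ N(0, 1)."""
    return scale * x0 + noise_std * rng.standard_normal(x0.shape)


rng = np.random.default_rng(0)
x0 = sample_two_peak(1_000_000, rng=rng)

# Data distribution: under the assumed mixture, kappa_4 = -2 * mu0^4 = -2.
k4_data = kappa4(x0)

# Driftless case (scale = 1): the added Gaussian noise leaves kappa_4 unchanged.
k4_driftless = kappa4(forward(x0, 1.0, 1.0, rng))

# Drifted case (scale = a < 1, variance-preserving style noise):
# kappa_4 is rescaled by a^4 and so moves toward zero.
a = 0.5
k4_drifted = kappa4(forward(x0, a, np.sqrt(1.0 - a**2), rng))

print(f"kappa_4 data:      {k4_data:+.4f}")
print(f"kappa_4 driftless: {k4_driftless:+.4f}")
print(f"kappa_4 drifted:   {k4_drifted:+.4f}  (a^4 * kappa_4 = {-2.0 * a**4:+.4f})")
```

The statistical error on the driftless estimate grows with the noise width, because the fluctuations of the estimator scale with the total variance while the signal $\kappa_4$ stays fixed. This is consistent with the captions comparing $10^5$, $10^6$ and $10^7$ trajectories during the forward process.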