Infinite-Dimensional Diffusion Models
Jakiw Pidstrigach, Youssef Marzouk, Sebastian Reich, Sven Wang
TL;DR
This work formulates and analyzes diffusion models directly in infinite-dimensional Hilbert spaces, enabling principled generative modeling in function space. It resolves three theoretical challenges: defining an infinite-dimensional score via conditional expectations, ensuring that the forward and reverse SDEs are well posed, and obtaining dimension-free convergence guarantees. It also provides practical guidelines for choosing noise covariances and loss norms. The authors introduce two design paradigms (IDDM1 and IDDM2) and show that for image data the canonical White Noise Diffusion Model aligns with the theory, whereas other data distributions benefit from tailored choices. They support the framework theoretically with existence and uniqueness results and Wasserstein bounds, and empirically on function-space examples including distributions on manifolds and Bayesian inverse problems. The result is a principled, scalable path for diffusion modeling directly in function spaces, with implications for inverse problems, simulation-based inference, and other infinite-dimensional data domains.
Abstract
Diffusion models have had a profound impact on many application areas, including those where data are intrinsically infinite-dimensional, such as images or time series. The standard approach is first to discretize and then to apply diffusion models to the discretized data. While such approaches are practically appealing, the performance of the resulting algorithms typically deteriorates as discretization parameters are refined. In this paper, we instead directly formulate diffusion-based generative models in infinite dimensions and apply them to the generative modelling of functions. We prove that our formulations are well posed in the infinite-dimensional setting and provide dimension-independent distance bounds from the sample to the target measure. Using our theory, we also develop guidelines for the design of infinite-dimensional diffusion models. For image distributions, these guidelines are in line with current canonical choices. For other distributions, however, we can improve upon these canonical choices. We demonstrate these results both theoretically and empirically, by applying the algorithms to data distributions on manifolds and to distributions arising in Bayesian inverse problems or simulation-based inference.
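
The role of the noise covariance under refinement can be illustrated with a small numerical sketch. It is not the authors' implementation. It assumes noise expanded in an orthonormal basis with coefficient variances k^(-decay), where decay = 0 stands in for discretized white noise and decay = 2 for a trace-class covariance (summable eigenvalues). The function names `sample_noise_coeffs` and `mean_squared_norm`, the decay values, and the mode counts are all hypothetical choices made for illustration.

```python
import numpy as np


def sample_noise_coeffs(n_modes, decay, rng, n_samples=2000):
    """Draw Gaussian basis coefficients with variance k**(-decay) for mode k.

    decay = 0 gives white noise truncated to n_modes modes (assumed stand-in
    for noise on a grid of that resolution); decay > 1 gives a covariance
    whose eigenvalues are summable, i.e. trace class.
    """
    k = np.arange(1, n_modes + 1, dtype=float)
    std = k ** (-decay / 2.0)
    return rng.standard_normal((n_samples, n_modes)) * std


def mean_squared_norm(coeffs):
    """Monte Carlo estimate of E||xi||^2.

    In an orthonormal basis the squared norm is the sum of squared
    coefficients (Parseval), so no spatial grid is needed here.
    """
    return float(np.mean(np.sum(coeffs ** 2, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n_modes in (64, 256, 1024):
        white = mean_squared_norm(sample_noise_coeffs(n_modes, 0.0, rng))
        trace = mean_squared_norm(sample_noise_coeffs(n_modes, 2.0, rng))
        print(f"modes={n_modes:5d}  white={white:9.2f}  trace-class={trace:6.3f}")
```

Under these assumptions the expected squared norm of the white-noise samples equals the number of modes and so grows without bound as the discretization is refined, while the trace-class samples stay bounded by the sum of the eigenvalues. This is one simple way to see why the choice of noise covariance matters in function space.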
