Neural Diffusion Processes
Vincent Dutordoir, Alan Saul, Zoubin Ghahramani, Fergus Simpson
TL;DR
Neural Diffusion Processes (NDPs) extend denoising diffusion models to function spaces by diffusing over finite marginals and enforcing stochastic-process properties, such as exchangeability, through a bi-dimensional attention block. This yields a flexible, non-Gaussian prior over functions that can emulate GP posteriors, marginalise hyperparameters, and perform conditional sampling given context data, with applications to image regression and global optimisation. Empirically, NDPs match or surpass Neural Processes on various benchmarks and approach GP performance in regression and Bayesian optimisation, while also enabling joint modelling of inputs and outputs. The work presents a practical, scalable approach to learning distributions over functions that builds stochastic-process properties into the architecture and supports a broad range of downstream tasks.
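To make "diffusing over finite marginals" concrete, the following is a minimal sketch of a forward noising step applied to function values observed at a finite set of inputs. It assumes a standard DDPM-style Gaussian forward process with a linear variance schedule; the schedule, the step count, the sine test function and the helper name `noise_outputs` are illustrative assumptions, not details taken from the paper. In this sketch only the outputs are noised and the inputs are held fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite marginal: function values at n input locations.
x = np.linspace(-1.0, 1.0, 50)
y0 = np.sin(3.0 * x)  # stand-in for one function drawn from the data distribution

# Assumed linear variance schedule (DDPM-style); not taken from the paper.
num_steps = 500
betas = np.linspace(1e-4, 0.02, num_steps)
alpha_bars = np.cumprod(1.0 - betas)  # cumulative signal retention per step

def noise_outputs(y0, t, rng):
    """Sample y_t ~ N(sqrt(alpha_bar_t) * y0, (1 - alpha_bar_t) * I).

    Hypothetical helper: only the outputs y are corrupted, x is untouched.
    Returns the noised values and the noise that was added.
    """
    eps = rng.normal(size=y0.shape)
    y_t = np.sqrt(alpha_bars[t]) * y0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return y_t, eps

y_early, _ = noise_outputs(y0, 10, rng)             # still close to y0
y_late, _ = noise_outputs(y0, num_steps - 1, rng)   # close to pure Gaussian noise
```

A denoising network would then be trained to predict `eps` from `(x, y_t, t)`, so that sampling can run the process in reverse, starting from Gaussian noise at any chosen set of inputs.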
Abstract
Neural network approaches for meta-learning distributions over functions have desirable properties such as increased flexibility and a reduced complexity of inference. Building on the successes of denoising diffusion models for generative modelling, we propose Neural Diffusion Processes (NDPs), a novel approach that learns to sample from a rich distribution over functions through its finite marginals. By introducing a custom attention block we are able to incorporate properties of stochastic processes, such as exchangeability, directly into the NDP's architecture. We empirically show that NDPs can capture functional distributions close to the true Bayesian posterior, demonstrating that they can successfully emulate the behaviour of Gaussian processes and surpass the performance of neural processes. NDPs enable a variety of downstream tasks, including regression, implicit hyperparameter marginalisation, non-Gaussian posterior prediction and global optimisation.
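The link between attention and exchangeability can be illustrated with a small numerical check. The sketch below uses plain single-head dot-product self-attention without positional encodings, which is an assumption made for illustration and not the NDP's bi-dimensional attention block (its details are not given in this section). It shows that permuting the data points permutes the outputs in the same way, which is the kind of permutation equivariance a denoiser needs if the ordering of the data points is not to affect the learned distribution.

```python
import numpy as np

def self_attention(h, w_q, w_k, w_v):
    """Single-head dot-product self-attention over the rows of h.

    No positional encoding is used, so rows are treated as an unordered set.
    """
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
n_points, d_model = 6, 4

# Hypothetical embeddings of n_points (x, y) pairs, one row per data point.
h = rng.normal(size=(n_points, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

perm = rng.permutation(n_points)
out = self_attention(h, w_q, w_k, w_v)
out_perm = self_attention(h[perm], w_q, w_k, w_v)

# Reordering the inputs reorders the outputs identically (equivariance).
equivariant = np.allclose(out[perm], out_perm)
print(equivariant)
```

Adding a positional encoding to `h` before the attention call would break this property, which is why architectures of this kind operate on data points as a set.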
