Neural Diffusion Models

Grigory Bartosh, Dmitry Vetrov, Christian A. Naesseth

TL;DR

This work addresses the rigidity of fixed forward processes in diffusion models by introducing Neural Diffusion Models (NDMs), which learn time-dependent nonlinear data transformations $F_\varphi(\mathbf{x}, t)$ to adapt the forward path. It provides a simulation-free variational objective and a continuous-time SDE/ODE formulation for the reverse process, enabling fast inference with standard solvers. Empirically, NDMs improve log-likelihood on CIFAR-10, downsampled ImageNet, and CelebA-HQ, while maintaining high-quality samples, and can learn simple dynamics such as dynamic optimal transport. The framework unifies and extends existing diffusion models, offering a density-estimation-friendly, flexible generative paradigm with broad applicability to compression, semi-supervised learning, and purification.

Abstract

Diffusion models have shown remarkable performance on many generative tasks. Despite recent success, most diffusion models are restricted in that they only allow linear transformations of the data distribution. In contrast, a broader family of transformations can potentially help train generative distributions more efficiently, simplifying the reverse process and closing the gap between the true negative log-likelihood and the variational approximation. In this paper, we present Neural Diffusion Models (NDMs), a generalization of conventional diffusion models that enables defining and learning time-dependent non-linear transformations of data. We show how to optimise NDMs using a variational bound in a simulation-free setting. Moreover, we derive a time-continuous formulation of NDMs, which allows fast and reliable inference using off-the-shelf numerical ODE and SDE solvers. Finally, we demonstrate the utility of NDMs with learnable transformations through experiments on standard image generation benchmarks, including CIFAR-10, downsampled versions of ImageNet, and CelebA-HQ. NDMs outperform conventional diffusion models in terms of likelihood and produce high-quality samples.
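
To make the idea of a learned, time-dependent forward transformation concrete, here is a minimal sketch. It assumes the forward marginal has the Gaussian form $\mathbf{z}_t = \alpha_t F_\varphi(\mathbf{x}, t) + \sigma_t \boldsymbol{\epsilon}$, which is not spelled out in the summary above. The cosine schedule in `alpha_sigma`, the toy `transform` function standing in for $F_\varphi$, and the weight matrix `w` are all hypothetical choices made for illustration; in practice $F_\varphi$ would be a neural network trained jointly with the reverse process.

```python
import numpy as np

def alpha_sigma(t):
    """Assumed variance-preserving cosine schedule: alpha^2 + sigma^2 = 1."""
    return np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)

def transform(x, t, w):
    """Hypothetical stand-in for F_phi(x, t); reduces to the identity at t = 0."""
    return x + t * np.tanh(x @ w)

def forward_sample(x, t, w, rng):
    """Draw z_t = alpha_t * F_phi(x, t) + sigma_t * eps directly, without simulating a path."""
    alpha, sigma = alpha_sigma(t)
    eps = rng.standard_normal(x.shape)
    return alpha * transform(x, t, w) + sigma * eps

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 2))   # toy batch of 2D data points
w = rng.standard_normal((2, 2))   # toy "learnable" parameters (illustrative only)
z_half = forward_sample(x, 0.5, w, rng)  # noised sample at the midpoint t = 0.5
```

Because `z_half` is drawn in a single step for any chosen `t`, a training loop can evaluate the objective at random times without integrating the forward process, which is the sense in which such a setup is simulation-free. Setting `transform` to return `x` unchanged recovers the usual linear forward process of a conventional diffusion model.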

Paper Structure

This paper contains 34 sections, 31 equations, 7 figures, 8 tables, and 2 algorithms.

Figures (7)

  • Figure 1: The directed graphical models of DDIM and NDM.
  • Figure 2: Learned transforms for the 2D checkerboard distribution (left). Learned transforms for CIFAR-10 and MNIST (top right), as well as predictions for MNIST (bottom right). NDM learns useful forward transformations and more accurately predicts the data from injected noise.
  • Figure 3: Comparison of DDPM and NDM with the reverse process restricted to optimal transport, on a 1D distribution.
  • Figure 4: Comparison of DDPM and NDM on a 2D distribution.
  • Figure 5: Samples $\mathbf{z}_t$ from the forward process and predicted data points $\hat{\mathbf{x}}_{\theta}(\mathbf{z}_t, t)$ on MNIST. (a) Samples from DDPM. (b) Samples from NDM. In each group, Left: data sample, Top: noised samples $\mathbf{z}_t$, Bottom: predicted data points $\hat{\mathbf{x}}_{\theta}(\mathbf{z}_t, t)$.
  • ...and 2 more figures