Structured Diffusion Models with Mixture of Gaussians as Prior Distribution

Nanshan Jia, Tingyu Zhu, Haoyu Liu, Zeyu Zheng

TL;DR

This paper proposes a class of structured diffusion models whose prior distribution is a mixture of Gaussians rather than a standard Gaussian distribution, and shows that the approach is robust to mis-specification.

Abstract

We propose a class of structured diffusion models in which the prior distribution is a mixture of Gaussians rather than a standard Gaussian distribution. The specific Gaussian mixture used as the prior can be chosen to incorporate structural information about the data. We develop a simple-to-implement training procedure that smoothly accommodates a Gaussian mixture prior. We provide theory that quantifies the benefits of the proposed models relative to classical diffusion models. Numerical experiments on synthetic, image, and operational data show the comparative advantages of our model. Our method is shown to be robust to mis-specification and is particularly suited to situations where training resources are limited or faster real-time training is desired.

Paper Structure

This paper contains 25 sections, 2 theorems, 36 equations, 17 figures, 8 tables, 4 algorithms.

Key Result

Proposition 1

Given the cluster number $K$ and the cluster centers ${\mathbf{c}}_1,\cdots,{\mathbf{c}}_K$, we define $X_i=\{{\mathbf{x}}:D({\mathbf{x}})=i\}$ and $p_i=\frac{\vert X_i\vert}{\sum_{j=1}^K\vert X_j\vert}$ for $i=1,2,\cdots,K$. Under the assumption that ${\mathbf{c}}_i$ is the arithmetic mean of $X_i$
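The quantities defined above can be illustrated with a minimal sketch. It rests on assumptions not stated in this excerpt: the assignment function $D({\mathbf{x}})$ is taken to be nearest-center assignment in Euclidean distance, each mixture component is given identity covariance, and the function names (`assign_clusters`, `cluster_proportions`, `sample_mixture_prior`) are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def assign_clusters(x, centers):
    """Assumed form of D(x): index of the nearest center (Euclidean)."""
    # Pairwise squared distances, shape (n_points, K).
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def cluster_proportions(labels, num_clusters):
    """p_i = |X_i| / sum_j |X_j|, the fraction of points in cluster i."""
    counts = np.bincount(labels, minlength=num_clusters)
    return counts / counts.sum()


def sample_mixture_prior(centers, weights, n, rng):
    """Draw n samples from sum_i p_i * N(c_i, I).

    Identity covariance is an assumption made for this sketch.
    """
    idx = rng.choice(len(centers), size=n, p=weights)
    noise = rng.standard_normal((n, centers.shape[1]))
    return centers[idx] + noise


if __name__ == "__main__":
    # Toy data: each center is the arithmetic mean of its cluster,
    # matching the assumption in Proposition 1.
    x = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 4.0]])
    centers = np.array([[0.1, 0.0], [4.0, 4.0]])

    labels = assign_clusters(x, centers)
    p = cluster_proportions(labels, num_clusters=len(centers))
    samples = sample_mixture_prior(centers, p, n=5, rng=np.random.default_rng(0))
    print(labels, p, samples.shape)
```

In this sketch the mixture weights are the empirical cluster proportions $p_i$, so clusters holding more data points are sampled more often when drawing from the prior.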

Figures (17)

  • Figure 1: DDPM with varying training steps
  • Figure 2: DDPM and mixDDPM on 1D Bimodal Gaussian Mixture Model
  • Figure 3: DDPM and mixDDPM on Oakland Call Center Dataset
  • Figure 4: EMNIST Experiments with N=128
  • Figure 5: Experiments on CIFAR10 with 480k Training Steps
  • ...and 12 more figures

Theorems & Definitions (4)

  • Proposition 1
  • Proposition 2
  • proof
  • proof