Mathematical analysis of singularities in the diffusion model under the submanifold assumption

Yubin Lu, Zhongjian Wang, Guillaume Bal

TL;DR

The paper analyzes diffusion-based generative modeling under a submanifold assumption and proves that the conventional score target blows up as $t\to0$ when the data are supported on a low-dimensional manifold. It introduces a conditional-expectation-based target (CEM) whose loss stays bounded, writing the backward drift as $S(X,t)=\frac{X}{1-e^{-t}}-\frac{e^{-t/2}}{1-e^{-t}}f(X,t)$ with $f(X,t)=E[X_0\mid X_t=X]$, and trains it with the time weighting $\lambda(t)=(e^t-1)^{-1}$ and an exponential sampling schedule. The theory gives the pointwise singularity $S(X,t)=\frac{X-y_X}{t}(1+o(1))$ as $t\to0$, whereas the CEM target remains bounded and is more stable to train. Experiments on 2D manifolds and MNIST show reduced pollution and better performance, and ablations highlight the importance of the schedule and the weighting. The approach offers a principled way to handle manifold-structured data in diffusion-based generative modeling, potentially improving robustness and sample quality in high-dimensional settings.
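As a rough numerical illustration of the two claims above (a sketch, not the authors' code), the snippet below evaluates the drift formula for a hypothetical five-point data set in 2D. It assumes the forward process $X_t=e^{-t/2}X_0+\sqrt{1-e^{-t}}\,\varepsilon$ with standard Gaussian $\varepsilon$, which is consistent with the coefficients in the formula but is not stated in this summary. The names `cond_mean` and `drift`, the data points, and the evaluation point are illustrative choices.

```python
import numpy as np

def cond_mean(x, t, points, weights):
    """f(x, t) = E[X_0 | X_t = x] for a discrete data distribution.

    Assumes X_t | X_0 ~ N(e^{-t/2} X_0, (1 - e^{-t}) I) (an assumption).
    """
    mean = np.exp(-t / 2) * points          # shape (n, d)
    var = -np.expm1(-t)                     # 1 - e^{-t}, accurate for small t
    logw = np.log(weights) - np.sum((x - mean) ** 2, axis=1) / (2 * var)
    logw -= logw.max()                      # stabilize before exponentiating
    w = np.exp(logw)
    w /= w.sum()                            # posterior weights of each data point
    return w @ points

def drift(x, t, points, weights):
    """S(x, t) = x / (1 - e^{-t}) - e^{-t/2} / (1 - e^{-t}) * f(x, t)."""
    var = -np.expm1(-t)
    return x / var - np.exp(-t / 2) / var * cond_mean(x, t, points, weights)

# Hypothetical data: five equally weighted points in the plane.
points = np.array([[0.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
weights = np.full(5, 0.2)
x = np.array([0.9, -0.1])                   # a point off the data support

for t in (1.0, 1e-1, 1e-2, 1e-3):
    f_norm = np.linalg.norm(cond_mean(x, t, points, weights))
    s_norm = np.linalg.norm(drift(x, t, points, weights))
    print(f"t={t:g}  |f|={f_norm:.4f}  |S|={s_norm:.2f}")
```

Because `cond_mean` is a convex combination of the data points, its norm never exceeds that of the largest data point, while the norm of `drift` grows roughly like $1/t$ at this evaluation point.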

Abstract

This paper concerns the mathematical analyses of the diffusion model in machine learning. The drift term of the backward sampling process is represented as a conditional expectation involving the data distribution and the forward diffusion. The training process aims to find such a drift function by minimizing the mean-squared residue related to the conditional expectation. Using small-time approximations of the Green's function of the forward diffusion, we show that the analytical mean drift function in DDPM and the score function in SGM asymptotically blow up in the final stages of the sampling process for singular data distributions such as those concentrated on lower-dimensional manifolds, and are therefore difficult to approximate by a network. To overcome this difficulty, we derive a new target function and associated loss, which remains bounded even for singular data distributions. We validate the theoretical findings with several numerical examples.
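The link between a mean-squared residue and a conditional expectation can be checked on a toy problem. The sketch below is not the paper's training procedure. It assumes the forward process $X_t=e^{-t/2}X_0+\sqrt{1-e^{-t}}\,\varepsilon$ and a hypothetical one-dimensional data set with $X_0=\pm1$ equally likely, for which $E[X_0\mid X_t=x]=\tanh\!\big(e^{-t/2}x/(1-e^{-t})\big)$. A piecewise-constant least-squares fit of $X_0$ against $X_t$ (bin averages) stands in for the network.

```python
import numpy as np

rng = np.random.default_rng(0)
t = 0.5
a = np.exp(-t / 2)                 # mean scaling e^{-t/2} (assumed forward process)
v = -np.expm1(-t)                  # noise variance 1 - e^{-t}
n = 200_000

# Hypothetical data: X_0 = -1 or +1 with equal probability.
x0 = rng.choice([-1.0, 1.0], size=n)
xt = a * x0 + np.sqrt(v) * rng.standard_normal(n)

# Least-squares fit among piecewise-constant functions = average of X_0 per bin.
edges = np.linspace(-1.5, 1.5, 31)
centers = 0.5 * (edges[:-1] + edges[1:])
idx = np.digitize(xt, edges) - 1   # -1 or 30 marks samples outside the grid
inside = (idx >= 0) & (idx < len(centers))
sums = np.bincount(idx[inside], weights=x0[inside], minlength=len(centers))
counts = np.bincount(idx[inside], minlength=len(centers))
f_binned = sums / counts

# Closed-form conditional expectation for the two-point data set.
f_exact = np.tanh(a * centers / v)
max_gap = np.max(np.abs(f_binned - f_exact))
print("max |f_binned - f_exact| =", max_gap)
```

The fitted values stay within $[-1,1]$, the range of the data, which is the boundedness property that makes the conditional expectation a convenient regression target.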

Paper Structure (21 sections, 2 theorems, 79 equations, 11 figures, 1 table)


Key Result

Theorem 3.5

(Singularity of the score functions) Let $X\in\mathbb{R}^d\backslash\Omega$ and let the data distribution $p_{data}$ satisfy (H1) and (H2). Then the score function $S(X,t)$ blows up as $t\to0$ and, more precisely, satisfies $S(X,t)=\frac{X-y_X}{t}(1+o(1))$.
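The $1/t$ rate can be read off heuristically from the drift formula given in the TL;DR. The computation below is only a sketch. It assumes that $f(X,t)=E[X_0\mid X_t=X]$ stays bounded and converges to a point $y_X\in\Omega$ as $t\to0$ (this summary does not define $y_X$ or state (H1) and (H2), so that limit is an assumption here), and it uses $X\neq y_X$, which holds because $X\notin\Omega$.

```latex
\begin{align*}
S(X,t) &= \frac{X - e^{-t/2} f(X,t)}{1-e^{-t}},
\qquad 1-e^{-t} = t\,\bigl(1+O(t)\bigr), \quad e^{-t/2} = 1+O(t), \\
S(X,t) &= \frac{X - f(X,t) + O(t)\,f(X,t)}{t\,\bigl(1+O(t)\bigr)}
        = \frac{X - f(X,t)}{t}\,\bigl(1+O(t)\bigr) + O(1), \\
S(X,t) &= \frac{X - y_X}{t}\,\bigl(1+o(1)\bigr)
\quad \text{as } t\to0, \text{ if } f(X,t)\to y_X \neq X .
\end{align*}
```

The same expansion shows why $f$ is the better-behaved quantity to learn: the singular factor $1/(1-e^{-t})$ is known in closed form and multiplies $f$ only after the regression is done.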

Figures (11)

  • Figure 1: 1d line normal distribution in 2d space. From left to right: CEM, SGM, DDPM, and the ground truth. The network configuration is as follows: 2 hidden layers, each layer with 16 nodes, and Tanh as the activation function.
  • Figure 2: Error at a fixed point $X_{eva}=(1,-0.1)$. Red, proposed CEM: $e(f,f_{\theta})$; Green, SGM: $e(S,S_{\theta})$; Blue, DDPM: $e(\epsilon,\epsilon_{\theta})$. (Left) first component of estimated function. (Right) second component of estimated function.
  • Figure 3: $L^2$-error with distribution $p$. Red, proposed CEM: $e_p(f,f_{\theta})$; Green, SGM: $e_p(S,S_{\theta})$; Blue, DDPM: $e_p(\epsilon,\epsilon_{\theta})$. (Left) the first component of the model function. (Right) the second component of the model function.
  • Figure 4: Curve distribution. From left to right: the proposed CEM, SGM, DDPM, and the ground truth. The network configuration is as follows: 3 hidden layers, each layer with 64 nodes, and Tanh as the activation function.
  • Figure 5: Generating a five-point distribution in 2d space using the analytic expression of the drift; scatter plots of the sampling process at $t=10,5.8718,3.2356,0.7518,0.0216,0$.
  • ...and 6 more figures

Theorems & Definitions (12)

  • Remark 3.1
  • Remark 3.2
  • Remark 3.3
  • Remark 3.4
  • Theorem 3.5
  • Remark 3.6
  • Theorem 3.7
  • Remark 3.8
  • Remark 3.9
  • Remark 3.10
  • ...and 2 more