Connective Viewpoints of Signal-to-Noise Diffusion Models

Khanh Doan, Long Tung Vuong, Tuan Nguyen, Anh Tuan Bui, Quyen Tran, Thanh-Toan Do, Dinh Phung, Trung Le

TL;DR

The paper unifies several viewpoints on Signal-to-Noise (S2N) diffusion models by connecting forward SDEs, backward SDEs, non-Markovian continuous variational (CV) diffusion, and an information-theoretic perspective in signal-to-noise space. It develops a forward SDE for S2N diffusion that is consistent with prior formulations, derives a generalized backward SDE with exact and parametric inference formulas, and extends the analysis to a non-Markovian CV model and its SDE. Mapping the models to SN space yields an information-theoretic view in terms of MMSE and derivatives of mutual information. The authors validate the framework with deterministic and stochastic sampling experiments, showing that carefully chosen hyperparameters such as $\gamma$, $\rho$, and $\delta$ can improve FID on multiple benchmarks, though the gains are dataset-dependent. Overall, the connective viewpoints offer a principled, flexible lens on noise scheduling and inference in S2N diffusion models, with practical implications for faster and higher-fidelity sampling at scale.

Abstract

Diffusion models (DM) have become fundamental components of generative models, excelling across various domains such as image creation, audio generation, and complex data interpolation. Signal-to-Noise diffusion models constitute a diverse family covering most state-of-the-art diffusion models. While there have been several attempts to study Signal-to-Noise (S2N) diffusion models from various perspectives, there remains a need for a comprehensive study connecting different viewpoints and exploring new perspectives. In this study, we offer a comprehensive perspective on noise schedulers, examining their role through the lens of the signal-to-noise ratio (SNR) and its connections to information theory. Building upon this framework, we have developed a generalized backward equation to enhance the performance of the inference process.

Paper Structure

This paper contains 25 sections, 7 theorems, 66 equations, 4 figures, and 2 tables.

Key Result

Theorem 1

With $f(t)=\frac{d\log\alpha(t)}{dt}=\frac{\alpha'(t)}{\alpha(t)}$ and $g(t)=\alpha(t)\sqrt{-\exp\{-\lambda(t)\}\,\lambda'(t)}$, the SDE flow in (eq:SDE) is equivalent to the S2N forward process in (eq:SNR).
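
A quick numerical sanity check of the coefficients in Theorem 1 can make the statement more concrete. The referenced equations are not reproduced on this page, so the sketch below rests on assumptions: that the forward process has the usual form $z_t=\alpha(t)x+\sigma(t)\epsilon$, that $\lambda(t)=\log(\alpha(t)^2/\sigma(t)^2)$ is the log signal-to-noise ratio, and that the SDE is linear, $dz=f(t)z\,dt+g(t)\,dw$. For such an SDE the conditional variance obeys $\frac{d\sigma^2}{dt}=2f(t)\sigma^2(t)+g(t)^2$, and the code checks that the theorem's $f$ and $g$ satisfy this identity. The cosine schedule and all function names are illustrative choices, not taken from the paper.

```python
import math

# Illustrative variance-preserving cosine schedule (an assumption, not the paper's).
def alpha(t):
    return math.cos(0.5 * math.pi * t)

def sigma(t):
    return math.sin(0.5 * math.pi * t)

def log_snr(t):
    # Assumed definition: lambda(t) = log(alpha^2 / sigma^2).
    return 2.0 * (math.log(alpha(t)) - math.log(sigma(t)))

def ddt(fn, t, h=1e-5):
    # Central finite difference approximation of d fn / dt.
    return (fn(t + h) - fn(t - h)) / (2.0 * h)

def drift_coeff(t):
    # f(t) = d log(alpha(t)) / dt, as in Theorem 1.
    return ddt(lambda s: math.log(alpha(s)), t)

def diffusion_coeff(t):
    # g(t) = alpha(t) * sqrt(-exp(-lambda(t)) * lambda'(t)), as in Theorem 1.
    return alpha(t) * math.sqrt(-math.exp(-log_snr(t)) * ddt(log_snr, t))

def variance_ode_residual(t):
    # Residual of d sigma^2/dt = 2 f sigma^2 + g^2; should be close to zero.
    sigma2 = lambda s: sigma(s) ** 2
    rhs = 2.0 * drift_coeff(t) * sigma2(t) + diffusion_coeff(t) ** 2
    return ddt(sigma2, t) - rhs

if __name__ == "__main__":
    for t in (0.1, 0.5, 0.9):
        print(t, variance_ode_residual(t))
```

Under this particular schedule, $g(t)^2$ reduces to $\pi\tan(\pi t/2)$ and $f(t)=-g(t)^2/2$, which is the familiar variance-preserving form; other schedules can be tested by swapping `alpha` and `sigma`, provided $\lambda(t)$ is decreasing so the square root is real.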

Figures (4)

  • Figure 1: The FID $(\downarrow)$ of deterministic sampling at several CIFAR-10 $(32\times32)$ checkpoints when varying $\gamma$ with NFE = 35.
  • Figure 2: FID $(\downarrow)$ for Class-Conditional ImageNet $(64\times64)$ dataset by stochastic sampling with NFE = 511.
  • Figure 3: Grid search results in FID $(\downarrow)$ for Unconditional CIFAR-10 $(32\times32)$ (VP) by stochastic sampling when varying $\gamma$, $\delta$ with NFE = 511.
  • Figure 4: Generated images with different values of $\gamma$.

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Corollary 1
  • Theorem 4
  • Theorem 5
  • Theorem 6