Moment Matching Denoising Gibbs Sampling

Mingtian Zhang, Alex Hawkins-Hooker, Brooks Paige, David Barber

TL;DR

We propose an efficient sampling framework, (pseudo-)Gibbs sampling with moment matching, that enables effective sampling from the underlying clean model given a "noisy" model that has been well trained via Denoising Score Matching (DSM).

Abstract

Energy-Based Models (EBMs) offer a versatile framework for modeling complex data distributions. However, both training and sampling from EBMs remain challenging. The widely used Denoising Score Matching (DSM) method for scalable EBM training suffers from inconsistency issues, causing the energy model to learn a "noisy" data distribution. In this work, we propose an efficient sampling framework, (pseudo-)Gibbs sampling with moment matching, which enables effective sampling from the underlying clean model given a "noisy" model that has been well trained via DSM. We explore the benefits of our approach compared to related methods and demonstrate how to scale the method to high-dimensional datasets.
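
The sampling loop described in the abstract can be sketched on a one-dimensional toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the noise is assumed Gaussian, $p(\tilde{x}|x)=\mathcal{N}(x,\sigma^2)$; the clean data are assumed Gaussian so that the noisy score is available in closed form and stands in for a DSM-trained model; and the posterior moments use the standard first- and second-order Tweedie identities for Gaussian noise. The names `MU`, `S2`, `SIGMA2` and all function names are hypothetical.

```python
import numpy as np

# Toy setup; every value here is an assumption chosen for illustration.
MU, S2 = 2.0, 1.0   # clean data distribution: N(MU, S2)
SIGMA2 = 0.25       # variance of the Gaussian noise p(x_tilde | x)


def noisy_score(x_tilde):
    """Score of the noisy marginal N(MU, S2 + SIGMA2).

    Stand-in for the gradient of a DSM-trained energy model.
    """
    return -(x_tilde - MU) / (S2 + SIGMA2)


def noisy_hessian(x_tilde):
    """Second derivative of the noisy log-density (constant for a Gaussian)."""
    return np.full_like(x_tilde, -1.0 / (S2 + SIGMA2))


def moment_matched_posterior(x_tilde):
    """Mean and variance of the Gaussian approximation q(x | x_tilde).

    Uses the Tweedie-type identities for Gaussian noise:
      mean = x_tilde + sigma^2 * score
      var  = sigma^2 + sigma^4 * hessian
    """
    mean = x_tilde + SIGMA2 * noisy_score(x_tilde)
    var = SIGMA2 + SIGMA2**2 * noisy_hessian(x_tilde)
    return mean, var


def pseudo_gibbs(n_chains=20000, n_steps=50, seed=0):
    """Alternate x ~ q(x | x_tilde) and x_tilde ~ p(x_tilde | x)."""
    rng = np.random.default_rng(seed)
    x_tilde = rng.normal(0.0, 5.0, size=n_chains)  # arbitrary initialisation
    x = x_tilde.copy()
    for _ in range(n_steps):
        mean, var = moment_matched_posterior(x_tilde)
        x = mean + np.sqrt(var) * rng.standard_normal(n_chains)         # denoise
        x_tilde = x + np.sqrt(SIGMA2) * rng.standard_normal(n_chains)   # re-noise
    return x


samples = pseudo_gibbs()
print("sample mean:", samples.mean(), "sample variance:", samples.var())
```

In this Gaussian toy case the moment-matched posterior coincides with the true posterior, so the chain is an exact Gibbs sampler and the clean samples should approach $\mathcal{N}(\texttt{MU}, \texttt{S2})$. For non-Gaussian data the Gaussian $q(x|\tilde{x})$ is only an approximation, which is presumably why the sampler is called "pseudo"-Gibbs.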

Paper Structure

This paper contains 20 sections, 5 theorems, 34 equations, 15 figures, 2 tables, 2 algorithms.

Key Result

Theorem 2.1

When the Fisher divergence is zero, ${\mathrm{FD}}(\tilde{p}_d||\tilde{q}_{\theta})=0$, which implies $\tilde{p}_d=\tilde{q}_{\theta}$, there exists a unique underlying clean model $q(x)$ such that $\tilde{q}_{\theta}(\tilde{x})=\int q(x)p(\tilde{x}|x)\mathop{}\!\mathrm{d}{x}$, and moreover $q(x)=p_d(x)$.
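
A one-dimensional Gaussian instance makes the statement concrete. This is an illustrative sketch, not taken from the paper: it assumes Gaussian noise $p(\tilde{x}|x)=\mathcal{N}(\tilde{x};x,\sigma^2)$ and Gaussian data $p_d(x)=\mathcal{N}(x;\mu,s^2)$, and the symbols $\mu$, $s^2$, $\sigma^2$, $m$, $v$ are introduced here only for the example.

```latex
\begin{align*}
\tilde{p}_d(\tilde{x})
  &= \int \mathcal{N}(x;\mu,s^2)\,\mathcal{N}(\tilde{x};x,\sigma^2)\,\mathrm{d}x
   = \mathcal{N}(\tilde{x};\mu,s^2+\sigma^2), \\
\tilde{q}_{\theta}(\tilde{x}) = \mathcal{N}(\tilde{x};m,v),\ v>\sigma^2
  &\;\Longrightarrow\; q(x)=\mathcal{N}(x;m,v-\sigma^2), \\
\tilde{q}_{\theta}=\tilde{p}_d
  &\;\Longrightarrow\; m=\mu,\ v=s^2+\sigma^2
   \;\Longrightarrow\; q(x)=\mathcal{N}(x;\mu,s^2)=p_d(x).
\end{align*}
```

For $v<\sigma^2$ no clean distribution exists, since adding independent noise can only increase the variance. The example therefore also shows that the existence of a clean model is not automatic for an arbitrary noisy model.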

Figures (15)

  • Figure 1: Figure (a) shows the clean data distribution $p_d(x)$ and the corresponding noisy distribution $\tilde{p}_d(\tilde{x})$. Figure (b) shows 4 conditioned samples in the noisy space. Figures (c, d, e) visualize the true posterior $p(x|\tilde{x})$ (green) and three posterior approximations (orange). We find that only the proposed $\tilde{x}$-dependent analytical full-covariance moment matching can capture the variance of the true posterior, whereas the other two methods underestimate the variance.
  • Figure 1: MMD evaluations of a single chain
  • Figure 2: Samples from a single Gibbs sampling chain
  • Figure 2: CIFAR10 Inception and FID Scores
  • Figure 3: Figures (a, b, c) show the MNIST experiment, where we compare samples generated by pseudo-Gibbs sampling with three different $q(x|\tilde{x})$. We plot samples from 25 independent Markov chains at time steps $t\in\{0,1,5,10,20\}$. The samples generated by the proposed analytical covariance moment matching with diagonal approximation achieve the best sample quality.
  • ...and 10 more figures

Theorems & Definitions (5)

  • Theorem 2.1: Existence of the underlying clean model for optimal $\tilde{q}_\theta(\tilde{x})$
  • Theorem 2.2: Analytical Covariance Identity
  • Theorem 2.3: Optimal Gaussian Approximation
  • Theorem A.1: Necessary and sufficient conditions for the existence of the underlying clean model
  • Lemma A.2: KL to Gaussian (Bao et al., 2022)
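
The list above gives only the theorem titles. As a hedged illustration of what an analytical covariance identity can look like, the sketch below numerically checks the standard second-order Tweedie identities for Gaussian noise $p(\tilde{x}|x)=\mathcal{N}(x,\sigma^2)$: posterior mean $\tilde{x}+\sigma^2\,\partial_{\tilde{x}}\log\tilde{p}(\tilde{x})$ and posterior variance $\sigma^2+\sigma^4\,\partial^2_{\tilde{x}}\log\tilde{p}(\tilde{x})$. Whether this matches the exact form of Theorem 2.2 is an assumption; the mixture parameters and function names are hypothetical.

```python
import numpy as np

# Hypothetical 1-D Gaussian mixture used as the clean data distribution.
W = np.array([0.3, 0.7])     # mixture weights
M = np.array([-2.0, 1.5])    # component means
S2 = np.array([0.5, 1.0])    # component variances


def normal_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def clean_pdf(x):
    return sum(w * normal_pdf(x, m, s2) for w, m, s2 in zip(W, M, S2))


def log_noisy_pdf(x_tilde, sigma2):
    """Log-density of the clean mixture convolved with N(0, sigma2)."""
    return np.log(sum(w * normal_pdf(x_tilde, m, s2 + sigma2)
                      for w, m, s2 in zip(W, M, S2)))


def posterior_moments_quadrature(x_tilde, sigma2, grid):
    """Mean and variance of p(x | x_tilde) by brute-force integration."""
    weights = clean_pdf(grid) * normal_pdf(x_tilde, grid, sigma2)
    weights = weights / weights.sum()  # uniform grid, so the spacing cancels
    mean = np.sum(weights * grid)
    var = np.sum(weights * (grid - mean) ** 2)
    return mean, var


def posterior_moments_identity(x_tilde, sigma2, h=1e-3):
    """Same moments from derivatives of the noisy log-density (finite differences)."""
    f = lambda t: log_noisy_pdf(t, sigma2)
    score = (f(x_tilde + h) - f(x_tilde - h)) / (2.0 * h)
    hessian = (f(x_tilde + h) - 2.0 * f(x_tilde) + f(x_tilde - h)) / h**2
    return x_tilde + sigma2 * score, sigma2 + sigma2**2 * hessian


grid = np.linspace(-15.0, 15.0, 20001)
for x_tilde in (-2.0, 0.0, 1.0):
    print(x_tilde,
          posterior_moments_quadrature(x_tilde, 0.5, grid),
          posterior_moments_identity(x_tilde, 0.5))
```

The two routes should agree up to discretisation error. The point of the identity is that the posterior variance depends on $\tilde{x}$ and can be read off the noisy model alone, without access to clean samples.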