Generative Models with ELBOs Converging to Entropy Sums

Jan Warnken, Dmytro Velychko, Simon Damm, Asja Fischer, Jörg Lücke

TL;DR

The paper shows that, for a broad class of exponential family (EF) generative models, the ELBO equals a sum of entropies at every stationary point, provided the model satisfies a parameterization criterion. It applies this result to several prominent models (Sigmoid Belief Networks, Gaussian-observable models including Probabilistic PCA, and mixture models with both constant and non-constant base measures), yielding compact, computable forms such as $\mathcal{F}(\Phi,\Theta)=\frac{1}{N}\sum_n \mathcal{H}[q^{(n)}_{\Phi}(\vec{z})]-\mathcal{H}[p_{\Theta}(\vec{z})]-\mathbb{E}_{\overline{q}_{\Phi}}\{\mathcal{H}[p_{\Theta}(\vec{x}|\vec{z})]\}$. These expressions allow the ELBO to be evaluated efficiently at stationary points (including saddle points) and support model analysis and model selection through entropy-based objectives. The work thus links variational objectives to entropy decompositions, with practical implications for deep generative learning.

Abstract

The evidence lower bound (ELBO) is one of the most central objectives for probabilistic unsupervised learning. For the ELBOs of several generative models and model classes, we here prove convergence to entropy sums. As one result, we provide a list of generative models for which entropy convergence has been shown so far, along with the corresponding expressions for entropy sums. Our considerations include very prominent generative models such as probabilistic PCA, sigmoid belief nets, and Gaussian mixture models. However, we also treat further models and entire model classes such as general mixtures of exponential family distributions. Our main contributions are the proofs for the individual models. For each given model, we show that the conditions stated in Theorem 1 or Theorem 2 of [arXiv:2209.03077] are fulfilled, such that by virtue of the theorems the given model's ELBO is equal to an entropy sum at all stationary points. The equality of the ELBO and the entropy sum at stationary points applies under realistic conditions: for finite numbers of data points, for model/data mismatches, at any stationary point including saddle points, etc., and it applies for any well-behaved family of variational distributions.
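The claimed equality can be checked numerically on a small case. The following is a minimal sketch, not the paper's proof or code: it assumes a one-dimensional Gaussian mixture with two components, uses the exact posteriors from an E-step as variational distributions $q^{(n)}(c)$, and compares the ELBO with the three-entropy sum $\frac{1}{N}\sum_n \mathcal{H}[q^{(n)}(c)]-\mathcal{H}[p_{\Theta}(c)]-\mathbb{E}_{\overline{q}}\{\mathcal{H}[p_{\Theta}(x|c)]\}$. All function names, the synthetic data, and the initial values are hypothetical choices made for illustration. For this model, the parameters are stationary for the current $q^{(n)}$ right after an exact M-step, which is already enough for the two quantities to coincide up to floating-point error.

```python
import math
import random


def normal_logpdf(x, mu, var):
    """Log density of a univariate Gaussian N(x; mu, var)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def e_step(xs, pis, mus, variances):
    """Exact posteriors q^(n)(c), computed stably in the log domain."""
    qs = []
    for x in xs:
        logw = [math.log(p) + normal_logpdf(x, m, v)
                for p, m, v in zip(pis, mus, variances)]
        top = max(logw)
        w = [math.exp(lw - top) for lw in logw]
        total = sum(w)
        qs.append([wi / total for wi in w])
    return qs


def m_step(xs, qs):
    """Closed-form maximum of the ELBO in (pi, mu, var) for fixed q."""
    n, k = len(xs), len(qs[0])
    nk = [sum(q[c] for q in qs) for c in range(k)]
    pis = [nk[c] / n for c in range(k)]
    mus = [sum(q[c] * x for q, x in zip(qs, xs)) / nk[c] for c in range(k)]
    variances = [sum(q[c] * (x - mus[c]) ** 2 for q, x in zip(qs, xs)) / nk[c]
                 for c in range(k)]
    return pis, mus, variances


def categorical_entropy(ps):
    """Entropy of a discrete distribution, with 0 * log 0 taken as 0."""
    return -sum(p * math.log(p) for p in ps if p > 0.0)


def elbo(xs, qs, pis, mus, variances):
    """ELBO per data point: E_q[log p(x, c)] + H[q], averaged over n."""
    total = 0.0
    for x, q in zip(xs, qs):
        for c, qc in enumerate(q):
            if qc > 0.0:
                log_joint = math.log(pis[c]) + normal_logpdf(x, mus[c], variances[c])
                total += qc * (log_joint - math.log(qc))
    return total / len(xs)


def entropy_sum(qs, pis, variances):
    """Average posterior entropy - prior entropy - expected noise entropy."""
    n, k = len(qs), len(pis)
    avg_q_entropy = sum(categorical_entropy(q) for q in qs) / n
    prior_entropy = categorical_entropy(pis)
    q_bar = [sum(q[c] for q in qs) / n for c in range(k)]  # aggregate posterior
    cond_entropy = sum(
        q_bar[c] * 0.5 * math.log(2.0 * math.pi * math.e * variances[c])
        for c in range(k))
    return avg_q_entropy - prior_entropy - cond_entropy


# Synthetic data and starting values (arbitrary, for illustration only).
random.seed(0)
xs = ([random.gauss(-2.0, 1.0) for _ in range(150)]
      + [random.gauss(3.0, 0.7) for _ in range(100)])
pis, mus, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]

gaps = []
for _ in range(5):
    qs = e_step(xs, pis, mus, variances)
    pis, mus, variances = m_step(xs, qs)
    # Compare both quantities for the q used in this M-step.
    gaps.append(abs(elbo(xs, qs, pis, mus, variances)
                    - entropy_sum(qs, pis, variances)))

gap = max(gaps)
print("largest |ELBO - entropy sum| over the EM iterations:", gap)
```

Note that the entropy sum needs only the responsibilities and the current parameters, not another pass over the data values, which is what makes such expressions cheap to evaluate once parameters are stationary.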

Paper Structure

This paper contains 7 sections, 5 theorems, and 46 equations.

Key Result

Proposition 1

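A Sigmoid Belief Net (Definition 2) is an EF generative model which satisfies the parameterization criterion (Definition 1). At all stationary points, the ELBO therefore equals the entropy sum: $\mathcal{F}(\Phi,\Theta)=\frac{1}{N}\sum_n \mathcal{H}[q^{(n)}_{\Phi}(\vec{z})]-\mathcal{H}[p_{\Theta}(\vec{z})]-\mathbb{E}_{\overline{q}_{\Phi}}\{\mathcal{H}[p_{\Theta}(\vec{x}|\vec{z})]\}$.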

Theorems & Definitions (15)

  • Definition 1: Parameterization Criterion
  • Definition 2: Sigmoid Belief Nets
  • Proposition 1: Sigmoid Belief Nets
  • proof
  • Definition 3: Gaussian observables with scalar variance
  • Proposition 2: Gaussian observables with scalar variance
  • proof
  • Definition 4: Gaussian observables with diagonal covariance
  • Proposition 3: Gaussian observables with diagonal covariance
  • proof
  • ...and 5 more