On the Convergence of the ELBO to Entropy Sums

Jörg Lücke; Jan Warnken

TL;DR

This work proves that, for a broad class of exponential-family (EF) generative models satisfying a parameterization criterion, the ELBO at any stationary point equals a sum of three entropies: the average entropy of the variational distributions, minus the entropy of the prior, minus the expected entropy of the observable distribution. The result is established first for EF models with constant base measures and then generalized to arbitrary EF models using pseudo entropies and new measures, which extends entropy-sum convergence beyond Gaussian settings. The theoretical contributions unify and generalize earlier Gaussian-focused results and enable entropy-based analyses, and potentially entropy-driven learning objectives, across diverse models, including variational autoencoders (VAEs), sigmoid belief networks (SBNs), factor analysis (FA), Gaussian mixture models (GMMs), and Poisson mixtures. The findings offer an information-theoretic perspective on ELBO optimization and open avenues for future work on learning objectives, optimization landscapes, and extensions to deeper or undirected generative architectures.
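The entropy-sum statement can be checked numerically on a small example. The following is a minimal sketch, not the authors' code: it assumes a one-dimensional Gaussian mixture with freely learnable mixing proportions, means, and variances, trained with plain EM so that the variational distributions are the exact posteriors and the ELBO coincides with the average log-likelihood at convergence. All function names, the synthetic data, and the initial values are hypothetical choices made for illustration.

```python
import numpy as np

def log_gauss(x, mu, var):
    # log N(x_n; mu_c, var_c) for every data point (rows) and component (columns)
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (x[:, None] - mu) ** 2 / var

def e_step(x, pi, mu, var):
    # Exact posteriors r[n, c] = p(c | x_n) and the average log-likelihood
    log_joint = np.log(pi) + log_gauss(x, mu, var)
    log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
    return np.exp(log_joint - log_norm), log_norm.mean()

def m_step(x, r):
    # Closed-form updates of mixing proportions, means, and variances
    nc = r.sum(axis=0)
    pi = nc / len(x)
    mu = (r * x[:, None]).sum(axis=0) / nc
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nc
    return pi, mu, var

def entropy_sum(r, pi, var):
    eps = 1e-300  # guards against log(0) if a responsibility underflows
    h_q = -(r * np.log(r + eps)).sum(axis=1).mean()   # average entropy of q
    h_prior = -(pi * np.log(pi)).sum()                 # entropy of the prior
    # expected entropy of the Gaussian observable distribution under q
    h_obs = (r * 0.5 * np.log(2 * np.pi * np.e * var)).sum(axis=1).mean()
    return h_q - h_prior - h_obs

# Hypothetical synthetic data: two well-separated clusters
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])

pi, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for step in range(300):
    r, _ = e_step(x, pi, mu, var)
    pi, mu, var = m_step(x, r)

r, loglik = e_step(x, pi, mu, var)
print("average log-likelihood (ELBO):", loglik)
print("entropy sum:                  ", entropy_sum(r, pi, var))
```

Under these assumptions the two printed values should agree up to numerical convergence error. If the variances or mixing proportions were held fixed instead of learned, the agreement would not be expected in general, which is where a parameterization criterion comes into play.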

Abstract

The variational lower bound (a.k.a. ELBO or free energy) is the central objective for many established as well as for many novel algorithms for unsupervised learning. Such algorithms usually increase the bound until parameters have converged to values close to a stationary point of the learning dynamics. Here we show that (for a very large class of generative models) the variational lower bound is at all stationary points of learning equal to a sum of entropies. Concretely, for standard generative models with one set of latents and one set of observed variables, the sum consists of three entropies: (A) the (average) entropy of the variational distributions, (B) the negative entropy of the model's prior distribution, and (C) the (expected) negative entropy of the observable distribution. The obtained result applies under realistic conditions including: finite numbers of data points, at any stationary point (including saddle points) and for any family of (well behaved) variational distributions. The class of generative models for which we show the equality to entropy sums contains many standard as well as novel generative models including standard (Gaussian) variational autoencoders. The prerequisites we use to show equality to entropy sums are relatively mild. Concretely, the distributions defining a given generative model have to be of the exponential family, and the model has to satisfy a parameterization criterion (which is usually fulfilled). Proving equality of the ELBO to entropy sums at stationary points (under the stated conditions) is the main contribution of this work.
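Written as a formula, the three-entropy statement for a model with one set of latents $\vec{z}$ and one set of observables $\vec{x}$ can be sketched as follows. The notation is an assumption made for illustration: $N$ is the number of data points, $q^{(n)}_{\Phi}(\vec{z})$ is the variational distribution for data point $n$, $p_{\Psi}(\vec{z})$ is the prior, $p_{\Theta}(\vec{x}\,|\,\vec{z})$ is the observable distribution, and $\mathcal{H}[\cdot]$ denotes entropy. At a stationary point of learning,

$$
\mathcal{F}(\Phi,\Psi,\Theta)
\;=\;
\underbrace{\frac{1}{N}\sum_{n=1}^{N}\mathcal{H}\big[q^{(n)}_{\Phi}(\vec{z})\big]}_{\text{(A)}}
\;-\;
\underbrace{\mathcal{H}\big[p_{\Psi}(\vec{z})\big]}_{\text{(B)}}
\;-\;
\underbrace{\frac{1}{N}\sum_{n=1}^{N}\mathbb{E}_{q^{(n)}_{\Phi}}\Big\{\mathcal{H}\big[p_{\Theta}(\vec{x}\,|\,\vec{z})\big]\Big\}}_{\text{(C)}}
$$

where $\mathcal{F}$ is the variational lower bound and the labels (A), (B), and (C) match the three terms listed in the abstract.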

Paper Structure

This paper contains 11 sections, 11 theorems, and 114 equations.

Introduction
The Class of Considered Generative Models
Equality to Entropy Sums at Stationary Points
Convergence for General Exponential Families
Discussion
Parameterization Criterion
Proof of Theorem \ref{th:Sum_of_Entr_Gen}
Entropies, Log-Likelihood, and Free Energy in the New Probability Spaces
Definitions and Basic Properties
Relation to Entropy and Free Energy
Theorem \ref{th:Sum_of_Entr_Gen} as Generalization of Theorem \ref{th:Sum_of_Entr}

Key Result

Lemma 1

Consider an EF generative model as given by Definition B, and let the dimensionalities of the natural parameter vectors $\vec{\zeta}$ and $\vec{\eta}$ be $K$ and $L$, respectively. Let further $\frac{\partial\vec{\zeta}^{\mathrm{T}}(\vec{\Psi})}{\partial\vec{\Psi}}$ and [...] In the case when $\vec{z}$ is a discrete latent variable, the integrals in (EqnLemmaParamCritB1) be [...]

Theorems & Definitions (32)

Definition A: Generative Model
Definition B: EF Generative Models
Definition C: Parameterization Criterion
Example 1: Simple SBN
Example 2: Simple Factor Analysis
Example 3: Counter-Example: Rigid SBN
Lemma 1
proof
Theorem 1: Equality to Entropy Sums
proof
...and 22 more
