The ELBO of Variational Autoencoders Converges to a Sum of Three Entropies
Simon Damm, Dennis Forster, Dmytro Velychko, Zhenwen Dai, Asja Fischer, Jörg Lücke
TL;DR
This work proves that for standard Gaussian VAEs, the ELBO at any stationary point equals a sum of three entropies: the average entropy of the encoder (variational) distributions, minus the entropy of the prior, minus the expected entropy of the decoder (observable) distribution. As a result, the bound can often be computed in closed form from the encoder and decoder variances. The paper introduces a reparameterized VAE (VAE-2) with a learnable prior covariance, relates it to the classic VAE-1, and extends the result to general Gaussian VAEs (including VAE-3, whose decoder covariance depends on the latent variable). The authors validate the theory with experiments on linear, nonlinear, and complex VAEs across diverse data sets, showing that the entropy sum tracks the ELBO with high accuracy near convergence. They also discuss entropy-based tools for ELBO estimation, model selection, and the analysis of posterior collapse. The entropy perspective provides a principled framework for interpreting VAE learning dynamics, connects optimization to the volumes of typical sets, and enables practical methods for monitoring and selecting models in streaming and large-scale settings.
Abstract
The central objective function of a variational autoencoder (VAE) is its variational lower bound (the ELBO). Here we show that for standard (i.e., Gaussian) VAEs the ELBO converges to a value given by the sum of three entropies: the (negative) entropy of the prior distribution, the expected (negative) entropy of the observable distribution, and the average entropy of the variational distributions (the latter is already part of the ELBO). Our derived analytical results are exact and apply to small as well as intricate deep networks for encoder and decoder. Furthermore, they apply to finitely and infinitely many data points and at any stationary point (including local maxima and saddle points). The result implies that for standard VAEs the ELBO can often be computed in closed form at stationary points, whereas the original ELBO requires numerical approximation of integrals. As a main contribution, we provide the proof that at stationary points the ELBO of VAEs is equal to entropy sums. Numerical experiments then show that the obtained analytical results are also sufficiently precise in those vicinities of stationary points that are reached in practice. Furthermore, we discuss how the novel entropy form of the ELBO can be used to analyze and understand learning behavior. More generally, we believe that our contributions can be useful for future theoretical and practical studies on VAE learning, as they provide novel information on those points in parameter space to which the optimization of VAEs converges.
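To make the closed-form claim concrete, the following is a minimal sketch of how the entropy sum could be evaluated from the learned variances. It assumes a common parameterization that the abstract does not spell out: a standard normal prior, encoder distributions with diagonal covariance, and a decoder with a single isotropic variance. The function names `gaussian_entropy` and `entropy_sum_elbo` are hypothetical and chosen only for illustration. The value is only expected to match the ELBO at (or close to) a stationary point.

```python
import numpy as np


def gaussian_entropy(var):
    """Entropy of a diagonal Gaussian, given its per-dimension variances.

    The entropy is summed over the last axis, so an input of shape (N, H)
    yields N entropies.
    """
    var = np.asarray(var, dtype=float)
    return 0.5 * np.sum(np.log(2.0 * np.pi * np.e * var), axis=-1)


def entropy_sum_elbo(encoder_var, decoder_var, data_dim):
    """Entropy-sum value: avg. encoder entropy - prior entropy - decoder entropy.

    Assumptions (not stated in the abstract):
      encoder_var : array of shape (N, H), diagonal encoder variances per data point
      decoder_var : scalar, isotropic decoder variance shared by all D dimensions
      data_dim    : D, the dimensionality of the observations
      prior       : standard normal N(0, I) over the H latent dimensions
    """
    encoder_var = np.asarray(encoder_var, dtype=float)
    latent_dim = encoder_var.shape[1]

    avg_encoder_entropy = gaussian_entropy(encoder_var).mean()
    prior_entropy = gaussian_entropy(np.ones(latent_dim))
    decoder_entropy = gaussian_entropy(np.full(data_dim, decoder_var))

    return avg_encoder_entropy - prior_entropy - decoder_entropy


if __name__ == "__main__":
    # Toy values standing in for the outputs of a trained encoder and decoder.
    rng = np.random.default_rng(0)
    toy_encoder_var = rng.uniform(0.05, 0.5, size=(100, 8))
    print(entropy_sum_elbo(toy_encoder_var, decoder_var=0.1, data_dim=784))
```

Under these assumptions the computation needs only the encoder variances for the data points and the decoder variance, with no sampling or numerical integration, which is what makes it cheap to use for monitoring near convergence.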
