Symmetric Equilibrium Learning of VAEs
Boris Flach, Dmitrij Schlesinger, Alexander Shekhovtsov
TL;DR
This work introduces symmetric equilibrium learning for VAEs by framing encoder and decoder as two players in a Nash game, addressing ELBO's inherent asymmetry and its limitations with complex priors, semi-supervised data, and structured latent spaces. The method defines two utilities, $L_p$ and $L_q$, enabling learning via simple gradient updates without reparametrisation, and proves a unique, stable equilibrium for lifted exponential-family models. It extends naturally to semi-supervised, unsupervised, and hierarchical VAEs, including implicit priors, and demonstrates practical applicability through experiments on MNIST and CelebA that match ELBO performance while improving encoder–decoder consistency and enabling tasks outside ELBO's scope. The approach broadens VAE applicability to more complex data, latent structures, and sampling-based learning scenarios, with potential impact on downstream tasks requiring robust bidirectional inference.
Abstract
We view variational autoencoders (VAEs) as decoder-encoder pairs, which map distributions in the data space to distributions in the latent space and vice versa. The standard learning approach for VAEs is the maximisation of the evidence lower bound (ELBO). It is asymmetric in that it aims at learning a latent variable model while using the encoder as an auxiliary means only. Moreover, it requires a closed-form a priori latent distribution. This limits its applicability in more complex scenarios, such as general semi-supervised learning and employing complex generative models as priors. We propose a Nash equilibrium learning approach, which is symmetric with respect to the encoder and decoder and allows learning VAEs in situations where both the data and the latent distributions are accessible only by sampling. The flexibility and simplicity of this approach allow its application to a wide range of learning scenarios and downstream tasks.
