Disentangling by Factorising
Hyunjik Kim, Andriy Mnih
TL;DR
The paper addresses unsupervised disentanglement by introducing FactorVAE, which augments the VAE objective with a Total Correlation (TC) penalty that encourages the latent distribution to be factorial. The penalty is estimated with a discriminator using the density-ratio trick. FactorVAE achieves better disentanglement than $\beta$-VAE at comparable reconstruction quality. The paper also proposes a new disentanglement metric that avoids the main failure modes of a commonly used earlier metric. In experiments on synthetic and real datasets, FactorVAE shows stronger latent-factor separation and more stable training than InfoGAN variants. The work discusses limitations of TC-based disentanglement and proposes future directions toward handling discrete factors and mixed latent types, with implications for more controllable and transferable generative models.
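For orientation, the Total Correlation penalty and its density-ratio estimate can be written out as follows. This is a sketch in standard VAE notation, none of which is defined in this summary: $q(z|x)$ is the encoder, $p(x|z)$ the decoder, $p(z)$ the prior, $q(z)$ the aggregate posterior, $\bar{q}(z) = \prod_j q(z_j)$ the product of its marginals, $\gamma$ the penalty weight, and $D(z)$ a discriminator's estimate of the probability that $z$ was drawn from $q(z)$ rather than $\bar{q}(z)$.

$$
\frac{1}{N} \sum_{i=1}^{N} \Big[ \mathbb{E}_{q(z|x^{(i)})}\big[\log p(x^{(i)}|z)\big] - \mathrm{KL}\big(q(z|x^{(i)}) \,\|\, p(z)\big) \Big] - \gamma \, \mathrm{KL}\big(q(z) \,\|\, \bar{q}(z)\big)
$$

$$
\mathrm{TC}(z) = \mathrm{KL}\big(q(z) \,\|\, \bar{q}(z)\big) = \mathbb{E}_{q(z)}\left[\log \frac{q(z)}{\bar{q}(z)}\right] \approx \mathbb{E}_{q(z)}\left[\log \frac{D(z)}{1 - D(z)}\right]
$$

The first expression is maximised with respect to the encoder and decoder. Setting $\gamma = 0$ recovers the usual VAE objective, and the TC term is zero exactly when $q(z)$ factorises across dimensions.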
Abstract
We define and address the problem of unsupervised learning of disentangled representations on data generated from independent factors of variation. We propose FactorVAE, a method that disentangles by encouraging the distribution of representations to be factorial and hence independent across the dimensions. We show that it improves upon $\beta$-VAE by providing a better trade-off between disentanglement and reconstruction quality. Moreover, we highlight the problems of a commonly used disentanglement metric and introduce a new metric that does not suffer from them.
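To make "encouraging the distribution of representations to be factorial" concrete, the following is a minimal NumPy sketch of two ingredients such a method needs. The first draws approximate samples from the product of marginals by shuffling each latent dimension independently across a batch. The second turns discriminator logits into a Total Correlation estimate. The function names `permute_dims` and `tc_estimate` and the toy data are illustrative assumptions, and no discriminator is trained here; this is not the authors' implementation.

```python
import numpy as np


def permute_dims(z, rng):
    """Shuffle each latent dimension independently across the batch.

    z: array of shape (batch, latent_dim), treated as samples from the
       aggregate posterior q(z).
    Returns an array of the same shape whose rows approximate samples from
    the product of marginals prod_j q(z_j).
    """
    z = np.asarray(z)
    out = np.empty_like(z)
    for j in range(z.shape[1]):
        # A fresh permutation per dimension breaks dependence between columns.
        out[:, j] = z[rng.permutation(z.shape[0]), j]
    return out


def tc_estimate(logits):
    """Density-ratio estimate of Total Correlation.

    logits: values of log(D(z) / (1 - D(z))) on samples z ~ q(z), where D(z)
    is a (hypothetical, separately trained) discriminator's probability that
    z came from q(z) rather than from the product of marginals.
    """
    return float(np.mean(logits))


rng = np.random.default_rng(0)

# Toy batch (assumed data): two perfectly correlated latent dimensions.
base = rng.normal(size=(1000, 1))
z = np.hstack([base, base])

z_perm = permute_dims(z, rng)
corr_before = np.corrcoef(z[:, 0], z[:, 1])[0, 1]
corr_after = np.corrcoef(z_perm[:, 0], z_perm[:, 1])[0, 1]
print(f"correlation before: {corr_before:.3f}, after: {corr_after:.3f}")
```

In a full training loop, a discriminator would be trained to tell `z` from `z_perm`, and `tc_estimate` applied to its logits on `z` would give the penalty term added to the VAE loss.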
