Towards Conceptual Compression
Karol Gregor, Frederic Besse, Danilo Jimenez Rezende, Ivo Danihelka, Daan Wierstra
TL;DR
This work introduces convolutional DRAW, a recurrent variational auto-encoder that produces progressively more abstract visual representations, separating global concepts from fine details. By stacking latent variables and refining the image iteratively, the model achieves state-of-the-art likelihoods among latent variable models on Omniglot and ImageNet, with results also reported on CIFAR-10, and demonstrates 'conceptual compression' by storing only high-level information. The authors also present compression-oriented techniques (arithmetic coding and bits-back coding) and analyze how information is distributed across layers and time steps, showing that early steps carry high-level information and later steps refine detail. Overall, the method advances unsupervised latent-variable image modeling and points to practical routes toward high-quality lossy compression that aligns with human perceptual judgments.
Abstract
We introduce a simple recurrent variational auto-encoder architecture that significantly improves image modeling. The system represents the state-of-the-art in latent variable models for both the ImageNet and Omniglot datasets. We show that it naturally separates global conceptual information from lower level details, thus addressing one of the fundamentally desired properties of unsupervised learning. Furthermore, the possibility of restricting ourselves to storing only global information about an image allows us to achieve high quality 'conceptual compression'.
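The idea of storing only global information can be illustrated with a toy sketch. It is not the authors' implementation: it assumes scalar Gaussian latents, one per time step, and fixed priors, whereas in a recurrent model the prior at each step would depend on earlier latents. The names `gaussian_kl` and `conceptual_compress` are hypothetical. The sketch keeps posterior samples for the first `k` steps, draws the remaining steps from the prior, and counts the storage cost as the sum of the KL terms of the kept steps, which is roughly what a bits-back style argument gives as the expected code length.

```python
import math
import random


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2)) in nats, scalar case."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
            - 0.5)


def conceptual_compress(posteriors, priors, k, rng):
    """Keep the first k latents from the posterior; sample the rest from the prior.

    posteriors, priors: lists of (mean, std) pairs, one per time step.
    Assumption: priors are fixed here; a real recurrent model would
    condition each prior on the previously sampled latents.
    Returns (latents, cost_in_bits).
    """
    latents = []
    cost_nats = 0.0
    for t, ((mq, sq), (mp, sp)) in enumerate(zip(posteriors, priors)):
        if t < k:
            # Stored step: sample from the posterior and pay its KL cost.
            latents.append(rng.gauss(mq, sq))
            cost_nats += gaussian_kl(mq, sq, mp, sp)
        else:
            # Discarded step: the decoder fills it in from the prior for free.
            latents.append(rng.gauss(mp, sp))
    return latents, cost_nats / math.log(2)


if __name__ == "__main__":
    rng = random.Random(0)
    posteriors = [(1.0, 0.5), (0.2, 0.9), (0.0, 1.0)]  # illustrative values
    priors = [(0.0, 1.0)] * 3
    for k in range(4):
        _, bits = conceptual_compress(posteriors, priors, k, rng)
        print(f"k={k}: {bits:.3f} bits stored")
```

With `k = 0` nothing is stored and the cost is zero; raising `k` adds the KL term of each additional kept step, so the cost never decreases as more steps are retained.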
