Intriguing Properties of Modern GANs
Roy Friedman, Yair Weiss
TL;DR
This work challenges the prevailing view that modern GANs learn the true data manifold by demonstrating that the GAN manifold does not pass through the training examples and often lies closer to out-of-distribution images than to in-distribution ones. It complements the manifold analysis with a density-oriented evaluation that uses Annealed Importance Sampling to estimate log-likelihoods, revealing that GANs assign lower likelihoods to real data than several non-GAN density models do and that training data are not typical under the learned distribution. The findings amount to four key properties: training samples are not on the GAN manifold, the manifold passes closer to out-of-distribution images than to in-distribution images, the generator's density matches the data distribution poorly, and training images lie outside the typical set. Together, these results suggest caution when using GANs as priors and underscore the value of likelihood-based and typical-set analyses for understanding the true behavior and limitations of generative models.
Abstract
Modern GANs achieve remarkable performance in terms of generating realistic and diverse samples. This has led many to believe that "GANs capture the training data manifold". In this work we show that this interpretation is wrong. We empirically show that the manifold learned by modern GANs does not fit the training distribution: specifically, the manifold does not pass through the training examples and passes closer to out-of-distribution images than to in-distribution images. We also investigate the distribution over images implied by the prior over the latent codes and study whether modern GANs learn a density that approximates the training distribution. Surprisingly, we find that the learned density is very far from the data distribution and that GANs tend to assign higher density to out-of-distribution images. Finally, we demonstrate that the images used to train modern GANs are often not part of the typical set described by the GANs' distribution.
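
The distinction between "high density" and "in the typical set" can be illustrated with a toy sketch. The example below is not the paper's procedure: it replaces the GAN's distribution with a standard normal in 2000 dimensions (an assumption made purely for illustration) and uses hypothetical helper names. It checks whether a point's per-dimension negative log-density lies within a tolerance `eps` of the per-dimension entropy, which is one common way to define typicality.

```python
import math
import random

def log_density_std_normal(x):
    """Log density of a standard normal N(0, I) at the point x (a list of floats)."""
    d = len(x)
    return -0.5 * d * math.log(2 * math.pi) - 0.5 * sum(v * v for v in x)

def in_typical_set(x, eps=0.1):
    """True if -log p(x) / d is within eps of the per-dimension entropy of N(0, I).

    eps is an illustrative tolerance, not a value taken from the paper.
    """
    d = len(x)
    entropy_per_dim = 0.5 * math.log(2 * math.pi * math.e)
    return abs(-log_density_std_normal(x) / d - entropy_per_dim) <= eps

rng = random.Random(0)  # fixed seed so the sketch is reproducible
d = 2000                # toy dimensionality (assumption)

sample = [rng.gauss(0.0, 1.0) for _ in range(d)]  # a draw from the model
mode = [0.0] * d                                  # the highest-density point

sample_is_typical = in_typical_set(sample)
mode_is_typical = in_typical_set(mode)
mode_has_higher_density = (
    log_density_std_normal(mode) > log_density_std_normal(sample)
)

print("sample typical:", sample_is_typical)
print("mode typical:", mode_is_typical)
print("mode has higher density:", mode_has_higher_density)
```

In this toy setting the mode has the highest density yet falls outside the typical set, while an ordinary sample falls inside it. This is why a density comparison and a typical-set check can give different answers about whether a given image is well described by a model.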
