Intriguing Properties of Modern GANs

Roy Friedman; Yair Weiss

TL;DR

This work challenges the prevailing view that modern GANs learn the true data manifold. It shows that the GAN manifold does not pass through the training examples and often lies closer to out-of-distribution images than to in-distribution ones. It complements this manifold analysis with a density-oriented evaluation that uses Annealed Importance Sampling (AIS) to estimate log-likelihoods, revealing that GANs assign lower likelihoods to real data than several non-GAN density models do and that training data are not typical under the learned distribution. The findings amount to four properties: training samples are not on the GAN manifold, the manifold often passes closer to out-of-distribution images than to in-distribution images, the generator's density matches the data distribution poorly, and training images lie outside the typical set. Together, these results suggest caution when using GANs as priors and underscore the value of likelihood-based and typical-set analyses for understanding the behavior and limitations of generative models.

Abstract

Modern GANs achieve remarkable performance in terms of generating realistic and diverse samples. This has led many to believe that "GANs capture the training data manifold". In this work we show that this interpretation is wrong. We empirically show that the manifold learned by modern GANs does not fit the training distribution: specifically, the manifold does not pass through the training examples and passes closer to out-of-distribution images than to in-distribution images. We also investigate the distribution over images implied by the prior over the latent codes and study whether modern GANs learn a density that approximates the training distribution. Surprisingly, we find that the learned density is very far from the data distribution and that GANs tend to assign higher density to out-of-distribution images. Finally, we demonstrate that the images used to train modern GANs are often not part of the typical set described by the GANs' distribution.
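To make the claim that the manifold "does not pass through the training examples" concrete, the following minimal sketch measures the distance from a point to a generator's manifold by optimizing only the latent code $z$. Everything here is an illustrative assumption rather than the paper's setup: the `generator` is a toy map onto the unit circle standing in for a GAN, and the `project` helper, its step size, and its restart scheme are hypothetical choices.

```python
import numpy as np

def generator(z):
    """Toy stand-in for a GAN: maps a scalar latent z onto the unit circle."""
    return np.array([np.cos(z), np.sin(z)])

def project(x, n_restarts=8, n_steps=500, lr=0.1):
    """Minimize ||G(z) - x||^2 over z by gradient descent with restarts.

    Returns the best latent found and the l2 reconstruction error.
    """
    best_z, best_err = None, np.inf
    for z in np.linspace(0.0, 2.0 * np.pi, n_restarts, endpoint=False):
        for _ in range(n_steps):
            residual = generator(z) - x
            jac = np.array([-np.sin(z), np.cos(z)])  # dG/dz
            z = z - lr * 2.0 * (residual @ jac)      # gradient step on z only
        err = np.linalg.norm(generator(z) - x)
        if err < best_err:
            best_z, best_err = z, err
    return best_z, best_err

# A "generated" point lies on the manifold; the other point does not.
on_manifold = generator(0.7)
off_manifold = 1.5 * generator(0.7)

_, err_on = project(on_manifold)
_, err_off = project(off_manifold)
```

In this toy setting, `err_on` is essentially zero while `err_off` stays near 0.5, the distance from the point to the circle. A gap of this kind between the reconstruction errors of generated and training images is what the paper reports for real GANs.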

Paper Structure (19 sections, 7 equations, 12 figures, 2 tables)

Introduction
Background
Evaluation of Generative Models
Annealed Importance Sampling
Are GANs Good Manifold Methods?
Inference with the GAN Manifold
Are GANs Good Density Estimators?
Adding an Observation Model
Evaluation as Density Models
Typicality of Training Examples
Related Works
Discussion
Brief Explanation of the AIS Algorithm
Implementation Details
AIS Details
...and 4 more sections

Figures (12)

Figure 1: Modern GANs work amazingly well, to the point that it is expected that they capture the data manifold (left). It is typically assumed that because GANs can generate realistic images (right) that they capture the true data manifold. In this paper we show that this assumption is false.
Figure 2: Left: $\ell_2$ distance (lower is better) of reconstructions of generated and train images under StyleGAN-XL (sauer2022stylegan) trained on ImageNet (when only $z$ is optimized). The distribution of reconstruction errors for training images is distinct from that for images generated by the GAN, implying that the training images are not part of the GAN manifold. Right: examples of projections to the GAN. Generated images are part of the manifold (and can be reconstructed), but training images are not. For training images, the reconstructed images are realistic birds but they are different birds.
Figure 3: Performance of different GANs at the task of outlier detection (top) and classification (bottom). In all cases, both the $\ell_2$ (plain) and the LPIPS (hatched, zhang2018unreasonable) distances are used. The GANs are compared to a 1-nearest neighbor (1NN) baseline. For CIFAR10, all methods are also compared with ViT-H (dosovitskiy2020image). The GANs always underperform, compared to the 1NN baseline.
Figure 4: Left: $\ell_2$ reconstruction errors (lower is better) for the reconstruction of different image groups by StyleGAN-XL trained on birds. The reconstruction error for birds or cars is similar, while there exist images that the GAN can reconstruct much better than those it was trained on. Right: examples for StyleGAN-XL reconstructions of car images and rescaled SVHN images. In all cases, the GANs capture the overall image structure, but don't retain the identity of the main object in the image, leading to projections which are far from the image whether in the training domain or not.
Figure 5: An example of a case where simply calculating the distance from the manifold might not tell the whole story. In this case, both $x_1$ and $x_2$ are the same distance from the manifold but the manifold is more abundant near $x_1$, making images in its region more plausible (left). By adding an observation model (right), queries to the model are transformed to calculating the log-likelihood under this observation model, essentially integrating over all possible areas of the manifold. With the observation model (right), $x_1$ is assigned a higher log-likelihood than $x_2$, even though they are equally distant from the manifold.
...and 7 more figures
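
The observation-model idea in the Figure 5 caption can be illustrated with a small sketch. It assumes a Gaussian observation model, $p(x) = \int p(z)\,\mathcal{N}(x; G(z), \sigma^2 I)\,dz$, with a toy generator $G(z) = (z, 0)$ and a standard normal prior; the value of `SIGMA`, the grid, and all function names are hypothetical. Because the latent is one-dimensional, the integral is computed with a plain Riemann sum instead of AIS, which the paper uses for high-dimensional latents. The geometry is also simplified: here the generator places more probability mass near $x_1$, rather than the manifold curving around it.

```python
import numpy as np

SIGMA = 0.5                              # observation noise std (assumed value)
GRID = np.linspace(-10.0, 10.0, 20001)   # quadrature grid over the latent z

def generator(z):
    """Toy generator: the manifold is the horizontal axis, G(z) = (z, 0)."""
    return np.stack([z, np.zeros_like(z)], axis=-1)

def log_normal(x, mean, std):
    """Log density of an isotropic Gaussian, summed over the last axis."""
    d = x.shape[-1]
    sq = np.sum((x - mean) ** 2, axis=-1)
    return -0.5 * sq / std**2 - d * np.log(std) - 0.5 * d * np.log(2.0 * np.pi)

def log_likelihood(x):
    """log p(x) = log of the integral of p(z) N(x; G(z), SIGMA^2 I) over z."""
    dz = GRID[1] - GRID[0]
    log_prior = log_normal(GRID[:, None], 0.0, 1.0)   # z ~ N(0, 1)
    log_obs = log_normal(x, generator(GRID), SIGMA)   # N(x; G(z), SIGMA^2 I)
    terms = log_prior + log_obs
    m = terms.max()                                   # stable log-sum-exp
    return m + np.log(np.sum(np.exp(terms - m))) + np.log(dz)

# Both queries are at distance 1 from the manifold (the horizontal axis).
x1 = np.array([0.0, 1.0])
x2 = np.array([3.0, 1.0])

ll1 = log_likelihood(x1)
ll2 = log_likelihood(x2)
```

Although `x1` and `x2` are equally far from the manifold, `ll1` exceeds `ll2` because more of the generator's probability mass lies near `x1`. This is the distinction a pure distance-to-manifold measure cannot make.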
