Using Skew to Assess the Quality of GAN-generated Image Features

Lorenzo Luzi; Helen Jenne; Ryan Murray; Carlos Ortiz Marrero

TL;DR

The authors prove that the Skew Inception Distance (SID) is a pseudometric on probability distributions, show how it extends FID, and present a practical method for computing it. They also show that principal component analysis can be used to speed up the computation of both FID and SID.

Abstract

The rapid advancement of Generative Adversarial Networks (GANs) makes robust evaluation of these models necessary. Among the established evaluation criteria, the Fréchet Inception Distance (FID) has been widely adopted due to its conceptual simplicity, fast computation time, and strong correlation with human perception. However, FID has inherent limitations, mainly stemming from its assumption that feature embeddings follow a Gaussian distribution and can therefore be described by their first two moments. As this assumption does not hold in practice, in this paper we explore the importance of third moments in image feature data and use this information to define a new measure, which we call the Skew Inception Distance (SID). We prove that SID is a pseudometric on probability distributions, show how it extends FID, and present a practical method for its computation. Our numerical experiments indicate that SID either tracks FID or, in some cases, aligns more closely with human perception when evaluating image features of ImageNet data. Our work also shows that principal component analysis can be used to speed up the computation of both FID and SID. Although we focus on using SID on image features for GAN evaluation, SID is applicable much more generally, including for the evaluation of other generative models.
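
As a rough illustration of the quantities discussed in the abstract, the sketch below computes the standard Fréchet distance between Gaussians fitted to two feature sets, optionally after a PCA projection. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names (`fit_pca`, `project`, `frechet_distance`) are hypothetical, fitting PCA on the reference features only is an assumption, and SID itself is not reproduced here because its definition does not appear in this summary.

```python
import numpy as np


def fit_pca(features, d):
    """Fit a d-dimensional PCA on `features` (rows are samples).

    Returns the feature mean and the top-d principal directions.
    Assumption: PCA is fitted on the reference set only.
    """
    mean = features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:d]


def project(features, mean, components):
    """Project features onto the fitted principal directions."""
    return (features - mean) @ components.T


def frechet_distance(x, y):
    """Fréchet distance between Gaussians fitted to the rows of x and y.

    Uses ||mu_x - mu_y||^2 + Tr(C_x) + Tr(C_y) - 2 Tr((C_x C_y)^{1/2}).
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    # Tr((C_x C_y)^{1/2}) is the sum of the square roots of the eigenvalues
    # of C_x C_y, which are real and non-negative up to numerical error.
    eigvals = np.linalg.eigvals(cov_x @ cov_y).real
    tr_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    mean_term = np.sum((mu_x - mu_y) ** 2)
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(2000, 64))        # stand-in for real-image features
    fake = rng.normal(0.1, 1.1, (2000, 64))   # stand-in for generated features
    mean, comps = fit_pca(real, d=16)
    print("full:   ", frechet_distance(real, fake))
    print("PCA(16):", frechet_distance(project(real, mean, comps),
                                       project(fake, mean, comps)))
```

Working in the reduced space shrinks the covariance matrices from 64×64 to 16×16 in this toy setting, which is the kind of saving the abstract attributes to PCA.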

Paper Structure (19 sections, 3 theorems, 27 equations, 7 figures, 2 tables)

This paper contains 19 sections, 3 theorems, 27 equations, 7 figures, 2 tables.

Introduction
Overview and summary of contributions
Extending FID to include skewness
Theoretical set-up
Defining SFD: A metric on the first three moments of probability distributions
Using dimensionality reduction to compute SID
The effect of PCA on FID
Time improvement for SID
SID Experiments
Features are skewed even after PCA
Comparing SID and FID
Conclusion
Related work
Proofs of homeomorphisms
Invertible transformations applied to the Frobenius metric
...and 4 more sections

Key Result

Proposition 1

The equation labeled eq:dp defines a metric $d_{\mathbb{P}}$.

Figures (7)

Figure 1: FID behaves similarly even when we project features down to lower dimensions via PCA. We show four types of distortions that are applied to ImageNet images: added Gaussian noise, salt and pepper noise, Gaussian blur, and rectangular occlusions. For the Gaussian blur, we use a kernel size of 5. In each plot, FID is shown as a function of the parameter controlling the distortion, after reducing the embeddings to dimension $d \in \{512, 256, 128, 64, 32, 16\}$. All experiments are done using Inception-v3 as a feature extractor on the entire (50K) ImageNet validation set.
Figure 2: Skew tracks human perception more closely than FID. The noise levels on the images are quite low and undetectable until $\sigma > 0.015$, yet the FID increases linearly throughout. Skew, on the other hand, stays low for these undetectable noise levels. These were typical images taken from the 50,000 samples used to calculate FID.
Figure 3: Skew tracks human perception more closely than FID. The blur levels on the images are quite low and undetectable until $\sigma > 0.4$, and skew stays low for these undetectable blur levels, in contrast to FID. These were typical images taken from the 50,000 samples used to calculate the terms.
Figure 4: SID behaves similarly even when we project features down to lower dimensions via PCA. We show four types of distortions that are applied to ImageNet images: added Gaussian noise, salt and pepper noise, Gaussian blur, and rectangular occlusions. For the Gaussian blur, we use a kernel size of 5. In each plot, SID is shown as a function of the parameter controlling the distortion, after reducing the embeddings to dimension $d \in \{512, 256, 128, 64, 32, 16\}$. All experiments are done using Inception-v3 as a feature extractor on the entire (50K) ImageNet validation set.
Figure 5: SID behaves similarly on noise distortion experiments across different choices of feature extractors. We show four types of distortion applied to ImageNet images, using Inception-v3, ResNet-18, ResNet-50, and ResNeXt-101 (32$\times$8d) as feature extractors.
...and 2 more figures
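
The four distortion types named in the captions above can be sketched as follows. This is an illustrative NumPy version only: the function names, the assumption that pixel values lie in [0, 1], the zero-padded blur borders, and the way each distortion is parameterized are all assumptions, since the summary does not specify how the authors implemented them.

```python
import numpy as np


def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return np.clip(img + rng.normal(0.0, sigma, size=img.shape), 0.0, 1.0)


def add_salt_and_pepper(img, p, rng):
    """Set a fraction `p` of pixels to black or white with equal probability."""
    out = img.copy()
    mask = rng.random(img.shape[:2])
    out[mask < p / 2] = 0.0                       # pepper
    out[(mask >= p / 2) & (mask < p)] = 1.0       # salt
    return out


def gaussian_blur(img, sigma, ksize=5):
    """Separable Gaussian blur with a `ksize`-tap kernel (requires sigma > 0)."""
    offsets = np.arange(ksize) - ksize // 2
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    out = img.astype(float)
    for axis in (0, 1):  # blur rows, then columns; borders are zero-padded
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, out
        )
    return out


def occlude_rectangle(img, top, left, height, width):
    """Black out one rectangle of the image."""
    out = img.copy()
    out[top:top + height, left:left + width] = 0.0
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32, 3))  # stand-in for an ImageNet image
    distorted = {
        "gaussian_noise": add_gaussian_noise(image, 0.015, rng),
        "salt_and_pepper": add_salt_and_pepper(image, 0.05, rng),
        "gaussian_blur": gaussian_blur(image, 0.4),
        "occlusion": occlude_rectangle(image, 8, 8, 10, 10),
    }
    for name, out in distorted.items():
        print(name, out.shape)
```

In an experiment of the kind the captions describe, each distorted image set would be passed through the feature extractor and the resulting embeddings compared against those of the undistorted set.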

Theorems & Definitions (11)

Example 1: The univariate exponential distribution
Example 2: The multivariate Gaussian
Proposition 1
proof
Theorem 1
Proposition 2
proof
Example 3
proof: Proof of Example 1
proof: Proof of Example 2
...and 1 more
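
The two distributions named in Examples 1 and 2 differ sharply in their third moments: an exponential distribution has skewness 2, while every coordinate of a Gaussian has skewness 0. The sketch below checks this empirically and also forms the full third central moment tensor, whose $d^3$ entries suggest why reducing the dimension first helps. The function names are hypothetical, and this is a generic illustration of third moments, not the paper's definition of SID or the content of its examples.

```python
import numpy as np


def coordinate_skewness(x):
    """Sample skewness (third standardized moment) of each column of x."""
    xc = x - x.mean(axis=0)
    m2 = (xc ** 2).mean(axis=0)
    m3 = (xc ** 3).mean(axis=0)
    return m3 / m2 ** 1.5


def third_central_moment(x):
    """Third central moment tensor of the rows of x, with shape (d, d, d)."""
    xc = x - x.mean(axis=0)
    return np.einsum("ni,nj,nk->ijk", xc, xc, xc) / x.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200_000
    expo = rng.exponential(size=(n, 1))
    gauss = rng.normal(size=(n, 1))
    print("exponential skewness:", coordinate_skewness(expo)[0])
    print("Gaussian skewness:   ", coordinate_skewness(gauss)[0])
    # The tensor has d**3 entries, so its cost grows quickly with dimension.
    print("tensor shape for d=8:", third_central_moment(rng.normal(size=(1000, 8))).shape)
```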
