
Using Skew to Assess the Quality of GAN-generated Image Features

Lorenzo Luzi, Helen Jenne, Ryan Murray, Carlos Ortiz Marrero

TL;DR

The authors prove that SID is a pseudometric on probability distributions, show how it extends FID, and present a practical method for its computation. They also show that principal component analysis can be used to speed up the computation of both FID and SID.

Abstract

The rapid advancement of Generative Adversarial Networks (GANs) creates a need for robust evaluation of these models. Among the established evaluation criteria, the Fréchet Inception Distance (FID) has been widely adopted due to its conceptual simplicity, fast computation time, and strong correlation with human perception. However, FID has inherent limitations, mainly stemming from its assumption that feature embeddings follow a Gaussian distribution and can therefore be characterized by their first two moments. As this does not hold in practice, in this paper we explore the importance of third moments in image feature data and use this information to define a new measure, which we call the Skew Inception Distance (SID). We prove that SID is a pseudometric on probability distributions, show how it extends FID, and present a practical method for its computation. Our numerical experiments indicate that SID either tracks with FID or, in some cases, aligns more closely with human perception when evaluating image features of ImageNet data. Our work also shows that principal component analysis can be used to speed up the computation time of both FID and SID. Although we focus on using SID on image features for GAN evaluation, SID is applicable much more generally, including for the evaluation of other generative models.
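To make the FID and PCA claims in the abstract concrete, here is a minimal sketch of the standard Fréchet distance between two feature sets, $\|\mu_x - \mu_y\|^2 + \mathrm{Tr}(\Sigma_x + \Sigma_y - 2(\Sigma_x \Sigma_y)^{1/2})$, with an optional PCA projection applied first. The function names (`pca_basis`, `frechet_distance`, `reduced_frechet_distance`) are hypothetical, and fitting the PCA basis on the real features only is an assumption made for illustration; the abstract does not say how the paper fits its projection. SID itself is not sketched because its definition does not appear in this summary.

```python
import numpy as np

def pca_basis(features, d):
    """Return the top-d principal directions of `features` (rows are samples)."""
    centered = features - features.mean(axis=0)
    # Right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:d].T  # shape (n_features, d)

def frechet_distance(x, y):
    """Frechet distance between Gaussians fitted to the rows of x and y."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.atleast_2d(np.cov(x, rowvar=False))
    cov_y = np.atleast_2d(np.cov(y, rowvar=False))
    # Tr((cov_x cov_y)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of cov_x @ cov_y, which are real and non-negative for
    # positive semi-definite inputs (clipped here against round-off).
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)

def reduced_frechet_distance(real, fake, d):
    """Assumption: project both sets onto the top-d PCA directions of `real`."""
    basis = pca_basis(real, d)
    return frechet_distance(real @ basis, fake @ basis)
```

The speed-up comes from the covariance step: after projection the matrices are $d \times d$ rather than full feature dimension, so the eigenvalue computation is much cheaper for small $d$.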

Paper Structure (19 sections, 3 theorems, 27 equations, 7 figures, 2 tables)

This paper contains 19 sections, 3 theorems, 27 equations, 7 figures, 2 tables.

Key Result

Proposition 1

Equation (eq:dp) defines a metric $d_{\mathbb P}$.

Figures (7)

  • Figure 1: FID behaves similarly even when we project features down to lower dimensions via PCA. We show four types of distortions that are applied to ImageNet images: added Gaussian noise, salt and pepper noise, Gaussian blur, and rectangular occlusions. For the Gaussian blur, we use a kernel size of 5. In each plot, FID is shown as a function of the parameter controlling the distortion, after reducing the embeddings to dimension $d \in \{512, 256, 128, 64, 32, 16\}$. All experiments are done using Inception-v3 as a feature extractor on the entire (50K) ImageNet validation set.
  • Figure 2: Skew tracks more with human perception than FID. The noise levels on the images are quite low and undetectable until $\sigma > 0.015$; yet the FID increases linearly throughout. On the other hand, skew stays low for these undetectable noise levels. These were typical images taken from the 50,000 samples used to calculate FID.
  • Figure 3: Skew tracks more with human perception than FID. Note that the blur levels on the images are quite low and undetectable until $\sigma > 0.4$, and skew stays low for these undetectable blur levels, in contrast to FID. These were typical images taken from the 50,000 samples used to calculate the terms.
  • Figure 4: SID behaves similarly even when we project features down to lower dimensions via PCA. We show four types of distortions that are applied to ImageNet images: added Gaussian noise, salt and pepper noise, Gaussian blur, and rectangular occlusions. For the Gaussian blur, we use a kernel size of 5. In each plot, SID is shown as a function of the parameter controlling the distortion, after reducing the embeddings to dimension $d \in \{512, 256, 128, 64, 32, 16\}$. All experiments are done using Inception-v3 as a feature extractor on the entire (50K) ImageNet validation set.
  • Figure 5: SID behaves similarly on noise distortion experiments across different choices of feature extractors. We show four types of distortion applied to ImageNet images, using Inception-v3, ResNet-18, ResNet-50, and ResNeXt-101 (32$\times$8d) as feature extractors.
  • ...and 2 more figures
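The four distortions named in the captions above (added Gaussian noise, salt and pepper noise, Gaussian blur with kernel size 5, rectangular occlusions) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the image format (float arrays of shape height × width × channels with values in $[0, 1]$), the edge padding used for the blur, and the zero fill used for occlusions are all assumptions, since the captions do not specify them.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma, then clip."""
    return np.clip(img + rng.normal(0.0, sigma, size=img.shape), 0.0, 1.0)

def add_salt_and_pepper(img, p, rng):
    """Set a fraction p of pixels to black or white, half each on average."""
    out = img.copy()
    u = rng.random(img.shape[:2])  # one draw per pixel, shared across channels
    out[u < p / 2] = 0.0           # pepper
    out[u > 1 - p / 2] = 1.0       # salt
    return out

def gaussian_blur(img, sigma, ksize=5):
    """Separable Gaussian blur with a ksize x ksize kernel (sigma > 0)."""
    half = ksize // 2
    t = np.arange(-half, half + 1)
    kernel = np.exp(-t**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    # Assumption: replicate border pixels so the output keeps the input size.
    padded = np.pad(img, ((half, half), (half, half), (0, 0)), mode="edge")

    def blur_1d(v):
        return np.convolve(v, kernel, mode="valid")

    out = np.apply_along_axis(blur_1d, 0, padded)  # blur along rows
    return np.apply_along_axis(blur_1d, 1, out)    # then along columns

def occlude(img, top, left, height, width):
    """Black out one rectangle (assumption: occlusions are filled with zeros)."""
    out = img.copy()
    out[top:top + height, left:left + width] = 0.0
    return out
```

In an experiment of the kind the captions describe, each distorted image set would be passed through the feature extractor and compared against the features of the undistorted set as the distortion parameter grows.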

Theorems & Definitions (11)

  • Example 1: The univariate exponential distribution
  • Example 2: The multivariate Gaussian
  • Proposition 1
  • proof
  • Theorem 1
  • Proposition 2
  • proof
  • Example 3
  • Proof of Example 1 (the univariate exponential distribution)
  • Proof of Example 2 (the multivariate Gaussian)
  • ...and 1 more
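
The listing above only names Examples 1 and 2, so the paper's own statements are not reproduced here. As standard background on why these two distributions are natural test cases for a third-moment measure, the following textbook computation (not taken from the paper) contrasts them. For $X \sim \mathrm{Exp}(\lambda)$ the raw moments are $\mathbb{E}[X^k] = k!/\lambda^k$, so with $\mu = 1/\lambda$:

$$
\mathbb{E}\big[(X-\mu)^3\big] = \mathbb{E}[X^3] - 3\mu\,\mathbb{E}[X^2] + 2\mu^3 = \frac{6}{\lambda^3} - \frac{6}{\lambda^3} + \frac{2}{\lambda^3} = \frac{2}{\lambda^3},
\qquad
\frac{\mathbb{E}\big[(X-\mu)^3\big]}{\mathrm{Var}(X)^{3/2}} = \frac{2/\lambda^3}{1/\lambda^3} = 2 .
$$

A multivariate Gaussian, by contrast, is symmetric about its mean, so all of its third central moments vanish. A measure that depends only on the first two moments cannot distinguish an exponential feature from a Gaussian one with matching mean and variance, whereas a third-moment term can.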