A Distributional Evaluation of Generative Image Models

Edric Tam, Barbara E Engelhardt

TL;DR

We propose the Embedded Characteristic Score (ECS), a comprehensive metric for evaluating the distributional match between the learned and target sample distributions, and explore its connection with moments and tail behavior.

Abstract

Generative models are ubiquitous in modern artificial intelligence (AI) applications. Recent advances have led to a variety of generative modeling approaches that are capable of synthesizing highly realistic samples. Despite these developments, evaluating the distributional match between the synthetic samples and the target distribution in a statistically principled way remains a core challenge. We focus on evaluating image generative models, where studies often treat human evaluation as the gold standard. Commonly adopted metrics, such as the Fréchet Inception Distance (FID), do not sufficiently capture the differences between the learned and target distributions, because the assumption of normality ignores differences in the tails. We propose the Embedded Characteristic Score (ECS), a comprehensive metric for evaluating the distributional match between the learned and target sample distributions, and explore its connection with moments and tail behavior. We derive natural properties of ECS and show its practical use via simulations and an empirical study.
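
The point about normality can be illustrated with a small numerical sketch. FID is the Fréchet distance between Gaussians fitted to embedded samples, so it depends only on sample means and covariances. The example below is a toy under stated assumptions, not the paper's experiment: the 4-dimensional synthetic data stands in for image embeddings, and the names `frechet_distance`, `sample_multivariate_t`, and `tail_fraction` are illustrative. It compares a Gaussian sample with a multivariate $t$ sample rescaled to the same covariance.

```python
import numpy as np

def frechet_distance(x, y):
    """Squared Frechet distance between Gaussians fitted to x and y (rows are samples)."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    # Tr((cov_x cov_y)^{1/2}) is the sum of square roots of the eigenvalues of cov_x @ cov_y.
    eigvals = np.linalg.eigvals(cov_x @ cov_y).real
    tr_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    mean_term = np.sum((mu_x - mu_y) ** 2)
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)

def sample_multivariate_t(rng, n, dim, df):
    """Multivariate t sample rescaled to identity covariance (requires df > 2)."""
    z = rng.standard_normal((n, dim))
    chi2 = rng.chisquare(df, size=(n, 1))
    t = z / np.sqrt(chi2 / df)             # covariance is df / (df - 2) * I
    return t * np.sqrt((df - 2.0) / df)    # rescale so the covariance is I

def tail_fraction(x, radius=4.0):
    """Fraction of samples whose Euclidean norm exceeds the given radius."""
    return float(np.mean(np.linalg.norm(x, axis=1) > radius))

rng = np.random.default_rng(0)
n, dim = 50_000, 4                         # toy sizes, chosen for illustration only
gauss_a = rng.standard_normal((n, dim))
gauss_b = rng.standard_normal((n, dim))
heavy = sample_multivariate_t(rng, n, dim, df=5.0)

fd_same = frechet_distance(gauss_a, gauss_b)   # Gaussian vs Gaussian
fd_heavy = frechet_distance(gauss_a, heavy)    # Gaussian vs heavy-tailed, matched moments

print(f"Frechet distance, Gaussian vs Gaussian: {fd_same:.5f}")
print(f"Frechet distance, Gaussian vs t (df=5): {fd_heavy:.5f}")
print(f"tail fraction, Gaussian: {tail_fraction(gauss_a):.4f}")
print(f"tail fraction, t (df=5): {tail_fraction(heavy):.4f}")
```

In this setup both Fréchet distances should be close to zero, while the heavy-tailed sample places far more mass beyond the chosen radius. A metric built only on the first two moments cannot see that difference.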

Paper Structure

This paper contains 13 sections, 4 theorems, 11 equations, 2 figures, 3 tables, 1 algorithm.

Key Result

Theorem 3.1

As $n, m \to \infty$, $\hat{r}_{f, T}(P, \tilde{P})$ converges to $r_{f, T}(P, \tilde{P})$ in probability.
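
Written out, and assuming that $\hat{r}_{f, T}$ is a real-valued estimator and that $n$ and $m$ denote the sizes of the two samples (neither is stated in this excerpt), convergence in probability means:

$$\lim_{n, m \to \infty} \Pr\left( \left| \hat{r}_{f, T}(P, \tilde{P}) - r_{f, T}(P, \tilde{P}) \right| > \varepsilon \right) = 0 \quad \text{for every } \varepsilon > 0.$$

In other words, the estimator is consistent: with enough samples from both distributions, it lands within any fixed tolerance of the population quantity with probability approaching one.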

Figures (2)

  • Figure 1: Sample of synthetic and real images. (a) 25 synthetic images generated from a DC-GAN model pretrained on CIFAR10 data. (b) 25 real images from the CIFAR10 testing dataset. (c) 25 synthetic images generated from a DC-GAN model pretrained on MNIST data. (d) 25 real images from the MNIST testing dataset.
  • Figure 2: Comparing Gaussian samples versus multivariate $t$ samples via two-dimensional PCA plots. (a) Gaussian (red) versus Gaussian (blue). (b) Gaussian (red) versus Multivariate $t$ ($\text{df} = 100$) (blue). (c) Gaussian (red) versus Multivariate $t$ ($\text{df} = 10$) (blue). (d) Gaussian (red) versus Multivariate $t$ ($\text{df} = 5$) (blue). (e) Gaussian (red) versus Multivariate $t$ ($\text{df} = 3$) (blue). (f) Gaussian (red) versus Multivariate $t$ ($\text{df} = 2.01$) (blue).
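
The kind of comparison described in the Figure 2 caption can be sketched as follows. This is a hedged illustration, not the authors' code: the dimension, sample size, and the choice to fit PCA on the pooled samples are assumptions, and the multivariate $t$ samples are left unscaled because the caption does not say whether they were rescaled. The helper names `sample_multivariate_t` and `pca_2d` are hypothetical. Instead of plotting, the sketch reports an extreme quantile of the projected radius for each panel's degrees of freedom.

```python
import numpy as np

def sample_multivariate_t(rng, n, dim, df):
    """Standard multivariate t: a Gaussian divided by an independent chi-square scale."""
    z = rng.standard_normal((n, dim))
    w = rng.chisquare(df, size=(n, 1)) / df
    return z / np.sqrt(w)

def pca_2d(reference, other):
    """Project both samples onto the top two principal directions of the pooled data."""
    pooled = np.vstack([reference, other])
    center = pooled.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(pooled - center, full_matrices=False)
    basis = vt[:2].T
    return (reference - center) @ basis, (other - center) @ basis

rng = np.random.default_rng(0)
n, dim = 20_000, 10                        # assumed sizes, not taken from the paper
gaussian = rng.standard_normal((n, dim))

extreme_radius = {}
for df in (100, 10, 5, 3, 2.01):           # degrees of freedom listed in the caption
    heavy = sample_multivariate_t(rng, n, dim, df)
    proj_gauss, proj_heavy = pca_2d(gaussian, heavy)
    # 99.9th percentile of the distance from the origin in the 2D projection.
    extreme_radius[df] = (
        float(np.quantile(np.linalg.norm(proj_gauss, axis=1), 0.999)),
        float(np.quantile(np.linalg.norm(proj_heavy, axis=1), 0.999)),
    )

for df, (r_gauss, r_heavy) in extreme_radius.items():
    print(f"df={df}: Gaussian radius {r_gauss:.2f}, t radius {r_heavy:.2f}")
```

As the degrees of freedom shrink, the projected $t$ sample should spread far beyond the Gaussian cloud, which is the visual contrast the panels are meant to convey.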

Theorems & Definitions (9)

  • Definition 1
  • Theorem 3.1
  • proof
  • Theorem 3.2
  • proof
  • Proposition 1
  • proof
  • Lemma 6.1 (Durrett, 2019)
  • proof