Towards a Scalable Reference-Free Evaluation of Generative Models

Azim Ospanov; Jingwei Zhang; Mohammad Jalali; Xuenan Cao; Andrej Bogdanov; Farzan Farnia

TL;DR

The Fourier-based Kernel Entropy Approximation (FKEA) method uses the random Fourier features framework to reduce the computational cost of the reference-free VENDI and RKE entropy scores. Empirical results indicate that the method is scalable and interpretable when applied to large-scale generative models.

Abstract

While standard evaluation scores for generative models are mostly reference-based, reference-dependent assessment can be difficult because applicable reference datasets are often unavailable. Recently, the reference-free entropy scores VENDI and RKE have been proposed to evaluate the diversity of generated data. However, estimating these scores from data incurs significant computational costs for large-scale generative models. In this work, we leverage the random Fourier features framework to reduce the computational cost and propose the Fourier-based Kernel Entropy Approximation (FKEA) method. We use FKEA's approximated eigenspectrum of the kernel matrix to efficiently estimate these entropy scores. Furthermore, we show how FKEA's proxy eigenvectors reveal the modes the method identifies when evaluating the diversity of produced samples. We provide a stochastic implementation of the FKEA assessment algorithm whose complexity $O(n)$ grows linearly with the sample size $n$. We extensively evaluate FKEA's numerical performance on standard image, text, and video datasets. Our empirical results indicate the method's scalability and interpretability when applied to large-scale generative models. The codebase is available at https://github.com/aziksh-ospanov/FKEA.
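As a rough illustration of the idea in the abstract, the following minimal sketch estimates order-1 (VENDI-style) and order-2 (RKE-style) entropy scores from the eigenvalues of a random-Fourier-feature covariance matrix instead of the full $n \times n$ kernel matrix. It is not the authors' implementation: the function names `rff_features` and `entropy_scores`, the Gaussian kernel convention $k(x,y)=\exp(-\|x-y\|^2/(2\sigma^2))$, the default parameters, and the exact score definitions are assumptions made for this example, and it processes the data in one batch rather than stochastically.

```python
import numpy as np


def rff_features(x, r, sigma, rng):
    """Map rows of x to 2r random Fourier features (hypothetical helper).

    Assumes a Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)), whose
    frequencies are drawn from N(0, sigma^-2 I).
    """
    d = x.shape[1]
    omega = rng.normal(scale=1.0 / sigma, size=(d, r))
    proj = x @ omega
    # Scaling by 1/sqrt(r) gives every feature vector unit norm.
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(r)


def entropy_scores(x, r=256, sigma=1.0, seed=0):
    """Return (order-1 score, order-2 score) from the feature covariance."""
    rng = np.random.default_rng(seed)
    phi = rff_features(x, r, sigma, rng)
    # The 2r x 2r covariance shares its non-zero eigenvalues with the
    # approximate normalized kernel matrix phi @ phi.T / n.
    cov = phi.T @ phi / x.shape[0]
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    eig = eig / eig.sum()
    nz = eig[eig > 1e-12]  # drop numerically zero eigenvalues before the log
    order1 = float(np.exp(-np.sum(nz * np.log(nz))))  # exp of Shannon entropy
    order2 = float(1.0 / np.sum(eig ** 2))  # inverse sum of squared eigenvalues
    return order1, order2


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
    four_modes = np.repeat(centers, 100, axis=0) + rng.normal(scale=0.05, size=(400, 2))
    print(entropy_scores(four_modes))
```

For a fixed feature dimension $2r$, building the covariance costs time linear in the number of samples, and the eigendecomposition depends only on $r$. On the toy data above, both scores should come out close to the number of well-separated clusters.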

Paper Structure (22 sections, 6 theorems, 30 equations, 20 figures, 6 tables, 1 algorithm)

Introduction
Related Work
Preliminaries
Kernel Function, Kernel Covariance Matrix, and Matrix-based Rényi Entropy
Shift-Invariant Kernels and Random Fourier Features
Computational Complexity of VENDI & RKE Scores
A Scalable Fourier-based Method for Computing Kernel Entropy Scores
Numerical Results
Conclusion
Proofs
Proof of Theorem \ref{thm:comp}
Proof of Theorem \ref{Theorem: FKEA-Guarantee}
Limitations
Additional Numerical Results
Real Image Dataset Modes
...and 7 more sections

Key Result

Theorem 1

If $\mathrm{VENDI}_\alpha (K)$ for $\alpha \neq 2$ is computable by a circuit $\mathcal{C}$ of size $s(n)$ over basis $\mathcal{B}$, then $n \times n$ matrices can be multiplied by a circuit of size $O(s(n))$ over basis $\mathcal{B} \cup \nabla \mathcal{B} \cup \{+, \times\}$.
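
A short worked note on why the case $\alpha = 2$ is set apart. It assumes the usual order-$\alpha$ definition of the score in terms of the eigenvalues $\lambda_1, \dots, \lambda_n$ of the normalized kernel matrix $K/n$, with a kernel normalized so that $k(x,x) = 1$. Neither the definition nor the symbols $\lambda_i$ appear in this excerpt, so treat them as assumptions.

```latex
\mathrm{VENDI}_\alpha(K) = \exp\!\Big(\frac{1}{1-\alpha}\,\log \sum_{i=1}^{n} \lambda_i^{\alpha}\Big),
\qquad
\sum_{i=1}^{n} \lambda_i^{2}
  = \mathrm{Tr}\!\Big(\big(\tfrac{1}{n}K\big)^{2}\Big)
  = \frac{1}{n^{2}} \sum_{i,j=1}^{n} K_{ij}^{2}
\;\;\Longrightarrow\;\;
\mathrm{VENDI}_2(K) = \frac{n^{2}}{\sum_{i,j=1}^{n} K_{ij}^{2}} .
```

Under these assumptions the order-2 score needs only the $n^2$ kernel entries and no eigendecomposition, which is consistent with the theorem relating the cost to matrix multiplication only for $\alpha \neq 2$.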

Figures (20)

Figure 1: Reference-based vs. reference-free scores on two datasets of Stable Diffusion XL generated elephant images. FID, Recall, and Coverage scores (colored orange) are reference-based, whereas VENDI and RKE scores (colored blue) are reference-free. Inception.V3 is used as the backbone embedding. Reference-based metrics use 'Indian elephant' samples in ImageNet as reference data.
Figure 2: RFF-based clusters identified by FKEA evaluation on the single-colored MNIST (Deng, 2012) dataset with pixel embedding, Fourier feature dimension $2r=4000$, and Gaussian kernel bandwidth $\sigma = 7$. The graphs show that the FKEA RKE/VENDI diversity metrics increase with the number of labels.
Figure 3: RFF-based clusters identified by FKEA evaluation on the ImageNet dataset with DinoV2 embedding, Fourier feature dimension $2r = 16\text{k}$, and Gaussian kernel bandwidth $\sigma = 25$. The graphs show that the FKEA diversity metrics increase with the number of labels per 50k samples.
Figure 4: Behavior of FKEA metrics under different truncation factors $\psi$ for StyleGAN3 (Karras et al., 2021) generated FFHQ samples.
Figure 5: FKEA diversity metrics with the increasing number of countries in the synthetic dataset.
...and 15 more figures

Theorems & Definitions (11)

Definition 1
Theorem 1
Remark 1
Theorem 2
Corollary 1
Remark 2
Lemma 1
Lemma 2
Lemma 3
Proof of Lemma \ref{lemma:matrixid}
...and 1 more
