Measuring Diversity in Co-creative Image Generation

Francisco Ibarrola; Kazjon Grace

TL;DR

We propose an alternative, based on the entropy of neural network encodings, for comparing diversity between sets of images. It does not require ground-truth knowledge and is easy to compute.

Abstract

Quality and diversity have been proposed as reasonable heuristics for assessing content generated by co-creative systems, but to date there has been little agreement on what constitutes diversity or how to measure it. Existing approaches for assessing the diversity of generative models are limited: they either compare the model's outputs to a ground truth that, in the era of large pre-trained generative models, might not be available, or they entail an impractical number of computations. We propose an alternative based on the entropy of neural network encodings for comparing diversity between sets of images; it does not require ground-truth knowledge and is easy to compute. We also compare two pre-trained networks and show how the choice between them relates to the notion of diversity that we want to evaluate. We conclude with a discussion of the potential applications of these measures for ideation in interactive systems, model evaluation, and more broadly within computational creativity.
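
To make the idea of an entropy over neural network encodings concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each image has already been encoded into a fixed-length vector by some pre-trained network, that the encodings are modelled as a multivariate Gaussian, and that only the K largest covariance eigenvalues are kept (K=20 is the value quoted in the figure captions). The function name `truncated_entropy` and the `eps` floor are hypothetical choices made for illustration.

```python
import numpy as np


def truncated_entropy(embeddings, k=20, eps=1e-12):
    """Gaussian differential entropy restricted to the top-k eigen-directions.

    Assumption (not taken from the paper): the encodings are treated as
    samples from a multivariate Gaussian, whose entropy over k dimensions is
    0.5 * k * log(2*pi*e) + 0.5 * sum(log(eigenvalue_i)).
    """
    x = np.asarray(embeddings, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] < 2:
        raise ValueError("expected an array of shape (n_samples >= 2, n_features)")

    # Sample covariance of the encodings (rows are samples, columns are features).
    cov = np.atleast_2d(np.cov(x, rowvar=False))

    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(cov)

    # Keep the k largest eigenvalues; floor them so the logarithm stays finite.
    top = np.clip(eigvals[::-1][:k], eps, None)

    return 0.5 * top.size * np.log(2.0 * np.pi * np.e) + 0.5 * np.sum(np.log(top))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for encodings of two image sets; real encodings would come
    # from a pre-trained network, which is outside the scope of this sketch.
    tight_set = rng.normal(scale=0.1, size=(200, 64))
    spread_set = rng.normal(scale=1.0, size=(200, 64))
    print(truncated_entropy(tight_set), truncated_entropy(spread_set))
```

Under these assumptions the score is only meaningful for comparing sets encoded by the same network with the same K: a set whose encodings are more spread out receives a higher value, and no ground-truth reference set is involved.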

Paper Structure (8 sections, 3 equations, 4 figures)

Introduction
Methods
Truncated Inception Entropy
Truncated CLIP Entropy
Open-source Implementation
Experiments
Text diversity
Discussion and Conclusions

Figures (4)

Figure 1: Image set generation process for diversity evaluation.
Figure 2: Samples of image sets generated with one of three methods to evaluate diversity behaviour.
Figure 3: Diversity values (TIE and TCE) using $K=20$ eigenvalues, for sets of images generated with four different criteria. All the between-set differences are statistically significant $(p<0.01)$ except for the TIE for the unusual and style sets.
Figure 4: TCE using $K=20$ eigenvalues, for sets of text prompts generated with three different criteria. All the between-set differences are statistically significant $(p<0.01)$.
