Leveraging generative models to characterize the failure conditions of image classifiers

Adrien LeCoz, Stéphane Herbin, Faouzi Adjed

TL;DR

Using the controllable, high-quality image generation made available by recent Generative Adversarial Networks (StyleGAN2), the failure conditions of an image classifier are expressed as directions of strong performance degradation in the generative model's latent space. These directions are then used to discover corner cases that combine multiple sources of corruption.

Abstract

In this work, we address the question of identifying the failure conditions of a given image classifier. To do so, we exploit the capacity of recent Generative Adversarial Networks (StyleGAN2) to produce controllable distributions of high-quality image data: the failure conditions are expressed as directions of strong performance degradation in the generative model's latent space. This analysis strategy is used to discover corner cases that combine multiple sources of corruption, and to compare the behavior of different classifiers in more detail. The directions of degradation can also be rendered visually by generating data, for better interpretability. Some degradations, such as image quality, can affect all classes, whereas others, such as shape, are more class-specific. The approach is demonstrated on the MNIST dataset augmented with two sources of corruption, noise and blur, and shows a promising way to better understand and control the risks of exploiting Artificial Intelligence components in safety-critical applications.
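
As a rough illustration of what "directions of strong performance degradation" can mean in practice, the sketch below ranks latent dimensions by how well they separate well-classified from mis-classified samples. It is only a minimal sketch under stated assumptions: the scoring rule (absolute mean difference divided by a pooled standard deviation), the function name `rank_discriminative_dims`, and the synthetic latent codes are all hypothetical stand-ins and are not taken from the paper, which may use a different criterion.

```python
import numpy as np

def rank_discriminative_dims(codes, is_correct, eps=1e-12):
    """Rank latent dimensions by how well they separate the two groups.

    Assumed criterion (not from the paper): per-dimension absolute mean
    difference between well-classified and mis-classified codes, divided
    by a pooled standard deviation.
    """
    codes = np.asarray(codes, dtype=float)
    is_correct = np.asarray(is_correct, dtype=bool)
    good, bad = codes[is_correct], codes[~is_correct]
    if len(good) == 0 or len(bad) == 0:
        raise ValueError("need at least one sample in each group")
    pooled_std = np.sqrt(0.5 * (good.var(axis=0) + bad.var(axis=0))) + eps
    scores = np.abs(good.mean(axis=0) - bad.mean(axis=0)) / pooled_std
    order = np.argsort(-scores)  # most discriminative dimension first
    return order, scores

# Synthetic stand-in for latent codes and classifier outcomes.
rng = np.random.default_rng(0)
n_samples, n_dims = 2000, 16
codes = rng.normal(size=(n_samples, n_dims))
# Hypothetical failure mode: errors become likely when dimension 3 is large.
is_correct = codes[:, 3] + 0.3 * rng.normal(size=n_samples) < 1.0

order, scores = rank_discriminative_dims(codes, is_correct)
print("top dimensions:", order[:3])
```

In a real setting, `codes` would hold the latent codes of the generated images and `is_correct` the outcome of the classifier under study on those images.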

Paper Structure

This paper contains 16 sections, 7 figures.

Figures (7)

  • Figure 1: An illustration of the approach. Starting from the latent space $\mathcal{S}$ of StyleGAN, we generate a population of images. The images are classified and the information on classification success is added in the space $\mathcal{S}$, where we find the dimensions discriminating well-classified vs. mis-classified images. These dimensions can then be used to visually render the corresponding influential attributes.
  • Figure 2: Samples of real corrupted data (top row) vs. generated data (bottom row)
  • Figure 3: t-SNE of generated samples in different latent spaces. $\mathcal{Z}$ does not encode the class as class information is concatenated to latent codes $z$ to form the input of the conditional generator, and does not clearly differentiate well-classified from mis-classified samples. $\mathcal{W}$ and $\mathcal{S}$ are able to separate the classes (the 10 clusters) and well-classified from mis-classified samples, $\mathcal{S}$ doing it better than $\mathcal{W}$.
  • Figure 4: (a) Histograms of values for the top 3 dimensions of $\mathcal{S}$ that discriminate the most between well-classified and mis-classified images after generation. For those top dimensions it is clear that latent codes resulting in well-classified and mis-classified images follow different distributions. (b) Histograms of values for 3 random dimensions of $\mathcal{S}$. For those dimensions, no difference is visible between the well-classified and mis-classified distributions.
  • Figure 5: Illustration of the degradation evolution starting from the same original image for the ten most influential dimensions. Each column represents one of the top ten influential dimensions; each row represents a different shift value (which also varies per dimension). More specifically, a shift reference value is defined for each dimension as the value that makes the classifier output equal to $0.50$ for the corresponding generated image, and each row represents a fraction of the dimension-specific shift reference value, written on the left as progress. Displayed above the images are the StyleSpace dimension index, an arrow representing the direction to follow (increase or decrease the value), and the classifier output for the true class.
  • ...and 2 more figures
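
The shift reference value described in the Figure 5 caption (the shift at which the classifier output for the true class equals 0.50) can be located with a one-dimensional search. The sketch below uses bisection and assumes that the output decreases monotonically with the shift magnitude. The helper `find_reference_shift`, the toy function `toy_prob` (standing in for "shift one latent dimension, generate the image, classify it"), and the chosen progress fractions are hypothetical and are not taken from the paper.

```python
import math
from typing import Callable

def find_reference_shift(prob_at: Callable[[float], float],
                         max_shift: float,
                         target: float = 0.5,
                         tol: float = 1e-6,
                         max_iter: int = 100) -> float:
    """Bisection search for the shift magnitude where prob_at(shift) == target.

    Assumes prob_at(0) > target >= prob_at(max_shift) and that prob_at
    decreases monotonically on [0, max_shift].
    """
    lo, hi = 0.0, max_shift
    if not (prob_at(lo) > target >= prob_at(hi)):
        raise ValueError("target is not bracketed by [0, max_shift]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if prob_at(mid) > target:
            lo = mid  # still classified with output above target: shift more
        else:
            hi = mid  # output at or below target: shift less
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def toy_prob(shift: float) -> float:
    """Toy true-class output: a logistic curve crossing 0.5 at shift = 2."""
    return 1.0 / (1.0 + math.exp(4.0 * (shift - 2.0)))

ref = find_reference_shift(toy_prob, max_shift=10.0)

# Hypothetical progress fractions, one per row of a Figure-5-style grid.
progress_levels = [0.0, 0.25, 0.5, 0.75, 1.0]
shifts = [p * ref for p in progress_levels]
print(f"reference shift: {ref:.4f}")
```

The sign of the shift (the arrow in the caption) is left to the caller: the search works on the magnitude only, and `prob_at` would apply it in the direction that degrades the classifier.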