Image class translation: visual inspection of class-specific hypotheticals and classification based on translation distance

Mikyla K. Bowen, Jesse W. Wilson

TL;DR

This work tackles the explainability gap and out-of-domain sensitivity of medical image classifiers by introducing I2I-CT, which translates each input image into $K$ class-specific hypotheticals, using CycleGAN for $K=2$ and StarGAN for $K>2$. The translation distances form a compact feature vector $d \in \mathbb{R}^K$ that supports visualization and simple classifiers, and on several medical datasets they yield accuracy competitive with or superior to end-to-end CNNs. Beyond classification, the approach reveals dataset biases and aids interpretability through visual inspection of the generated hypotheticals. The results show that translation-distance classifiers can match or exceed CNN performance on multi-class tasks and offer a practical, interpretable complement to traditional black-box models in medical imaging.
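
As a reading aid, the feature vector can be written out explicitly. The Figure 1 caption labels the translation distance as L1, and the figure captions write the class-$k$ hypothetical of an input $X$ as $G_k(X)$. The pixel index $p$ and the absence of any normalization are assumptions made here for illustration, not details confirmed by the source.

$$
d(X) = (d_1, \dots, d_K) \in \mathbb{R}^K, \qquad d_k = \lVert X - G_k(X) \rVert_1 = \sum_{p} \lvert X_p - G_k(X)_p \rvert
$$

A small $d_k$ means the image needed little alteration to conform to class $k$.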

Abstract

Purpose: A major barrier to the implementation of artificial intelligence for medical applications is the lack of explainability and the tendency toward high-confidence incorrect decisions, particularly on out-of-domain samples. We propose a generalization of image translation networks for image classification and demonstrate their potential as a more interpretable alternative to conventional black-box classifiers.

Approach: We train an image2image network to translate an input image to class-specific hypotheticals, and then compare these with the input, both visually and quantitatively. Translation distances, i.e., the degree of alteration needed to conform to one class or another, are examined for clusters and trends, and used as simple low-dimensional feature vectors for classification.

Results: On melanoma/benign dermoscopy images, a translation distance classifier achieved 80% accuracy using only a 2-dimensional feature space (versus 85% for a conventional CNN using a ~62,000-dimensional feature space). Visual inspection of rendered images revealed dataset biases, such as scalebars, vignetting, and pale background pigmentation in melanomas. Image distributions in translation distance space revealed a natural separation along the lines of the dermatologist's decision to biopsy, rather than between malignant and benign. On bone marrow cytology images, translation distance classifiers outperformed a conventional CNN in both 3-class (92% accuracy vs 89% for CNN) and 6-class (90% vs 86% for CNN) scenarios.

Conclusions: This proof-of-concept shows the potential for image2image networks to go beyond artistic/stylistic changes and to expose dataset biases, perform dimension reduction and dataset visualization, and, in some cases, outperform conventional end-to-end CNN classifiers.
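
The approach described above can be illustrated with a minimal sketch. It assumes the translation distance is a mean absolute (L1) pixel difference between the input and each class-specific hypothetical. The names `translation_distances`, `nearest_class`, and `make_toy_generator` are hypothetical, and the toy generators merely stand in for trained CycleGAN/StarGAN generators. The smallest-distance rule is a simplified stand-in for the classifier trained on the distance vectors, not the authors' method.

```python
import numpy as np


def translation_distances(x, generators):
    """Return d in R^K: mean absolute (L1) difference between the input
    image x and each class-specific hypothetical G_k(x)."""
    return np.array([np.mean(np.abs(x - g(x))) for g in generators])


def nearest_class(d):
    """Assumed decision rule: pick the class whose hypothetical required
    the least alteration (smallest translation distance)."""
    return int(np.argmin(d))


def make_toy_generator(target_intensity):
    """Hypothetical stand-in for a trained generator: pulls every pixel
    halfway toward a class-typical intensity."""
    def g(x):
        return 0.5 * x + 0.5 * target_intensity
    return g


# Two toy "classes": dark images (around 0.2) and bright images (around 0.8).
generators = [make_toy_generator(0.2), make_toy_generator(0.8)]

# A synthetic 8x8 input that resembles the dark class.
rng = np.random.default_rng(0)
x = np.clip(0.25 + 0.05 * rng.standard_normal((8, 8)), 0.0, 1.0)

d = translation_distances(x, generators)  # 2-dimensional feature vector
predicted = nearest_class(d)
print("translation distances:", d)
print("predicted class index:", predicted)
```

In practice the vectors `d` from a training set would be fed to a simple classifier (the Figure 2 caption mentions an SVM) rather than to a fixed smallest-distance rule, and they can be scatter-plotted directly when $K=2$.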

Paper Structure (22 sections, 20 figures, 1 table)


Figures (20)

  • Figure 1: Image2image classification network architecture. An image is fed through an image translation network such as a CycleGAN (above) or a StarGAN, in which the generator and discriminator compete to make realistic images of each class. The resulting images are used for visual inspection and to compute translation (L1) distances, which can be used for classification in a translation distance classifier.
  • Figure 2: (a) ROC Curve of an end-to-end CNN (green line) and the translation distance SVM (orange line). (b) Scatterplot of test set translation distances. (c) Scatterplot of structural similarity index measure (SSIM).
  • Figure 3: (a) Input orange images translated into the orange domain, $G_1(X)$, and apple domain, $G_2(X)$. (b) Input apple images translated into the orange domain, $G_1(X)$, and apple domain, $G_2(X)$.
  • Figure 4: Image panel of input benign images (X) translated into the benign class, $G_1$, and the malignant class, $G_2$, using different lambda values for cycle consistency (CC) and identity (ID).
  • Figure 5: Image panel of input malignant images (X) translated into the benign class, $G_1$, and the malignant class, $G_2$, using different lambda values for cycle consistency (CC) and identity (ID).
  • ...and 15 more figures