DiffEx: Explaining a Classifier with Diffusion Models to Identify Microscopic Cellular Variations

Anis Bourou, Saranga Kingkor Mahanta, Thomas Boyer, Valérie Mezger, Auguste Genovesio

TL;DR

DiffEx addresses the interpretability gap in classifier decisions for biological and natural images by constructing a classifier-aware semantic latent space with encoder-conditioned diffusion, and by learning interpretable latent directions through contrastive learning. It identifies directions in latent space that meaningfully alter classifier probabilities and ranks them to produce visually faithful, disentangled explanations. The approach enables uncovering subtle cellular phenotypes in microscopy data and demonstrates stronger qualitative and quantitative performance than prior methods such as GCD, across natural and biomedical datasets. Overall, DiffEx offers a practical, diffusion-based path to understanding and leveraging classifier decisions for biomedical insight and biomarker discovery.

Abstract

In recent years, deep learning models have been extensively applied to biological data across various modalities. Discriminative deep learning models have excelled at classifying images into categories (e.g., healthy versus diseased, treated versus untreated). However, these models are often perceived as black boxes due to their complexity and lack of interpretability, limiting their application in real-world biological contexts. In biological research, explainability is essential: understanding classifier decisions and identifying subtle differences between conditions are critical for elucidating the effects of treatments, disease progression, and biological processes. To address this challenge, we propose DiffEx, a method for generating visually interpretable attributes to explain classifiers and identify microscopic cellular variations between different conditions. We demonstrate the effectiveness of DiffEx in explaining classifiers trained on natural and biological images. Furthermore, we use DiffEx to uncover phenotypic differences within microscopy datasets. By offering insights into cellular variations through classifier explanations, DiffEx has the potential to advance the understanding of diseases and aid drug discovery by identifying novel biomarkers.

Paper Structure

This paper contains 18 sections, 10 equations, 9 figures, 2 tables, 1 algorithm.

Figures (9)

  • Figure 1: DiffEx primarily consists of three stages: (a) A semantic latent space is constructed by combining the embedding obtained from an encoder with the classifier's prediction for each image. The resulting representation is used to condition the DDIM. (b) Directional models are learned in this semantic latent space using a self-supervised approach. (c) After identifying the directions that most significantly affect the classification probability, we shift the images accordingly. For example, in the accompanying figure, a single image is shifted along the identified directions, resulting in visibly different images that highlight the changes induced by these directions.
  • Figure 2: Shifting images toward the opposite class using directions identified by DiffEx. Left: When transforming male images toward the female class, the appearance of lipstick becomes noticeable, suggesting it as a discriminative attribute for the classifier. Right: When shifting female images toward the male class, hairstyles tend to become shorter, indicating an attribute associated with the male class. The probabilities of the target classes are shown in red.
  • Figure 3: Images from two datasets: (a) BBBC021 dataset and (b) Golgi dataset. While the differences between the two classes are apparent in BBBC021—such as the disappearance of the cytoplasm and fewer nuclei—they are more subtle in the Golgi dataset.
  • Figure 4: Shifting images toward the opposite class. Left: DiffEx identified three distinct directions for transitioning from the untreated to the treated class. Direction 1 eliminates the cytoplasm and most cells, leaving a single nucleus at the center. Direction 2 removes the cytoplasm without eliminating all nuclei. Direction 3 tends to cluster nuclei closer together and decreases the intensity of the red channel. Right: To shift from the treated to the untreated class, Direction 1 increases the intensity of the red channel and pushes nuclei apart. Direction 2 enhances the green channel, while Direction 3 increases the cell count, replicating known phenotypes.
  • Figure 5: Shifting images toward the opposite class. Left: When transitioning from the treated to the untreated class, the Golgi apparatus tends to aggregate. Right: Conversely, shifting from the untreated to the treated class results in its dispersion. These observations replicate the phenotypic effects of the treatment, which induces Golgi apparatus scattering.
  • ...and 4 more figures
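
Stage (c) of Figure 1, selecting the directions that most affect the classification probability, can be illustrated with a minimal sketch. This is not the authors' implementation: in DiffEx the shifted latent code conditions a DDIM and the classifier scores the generated image, whereas here a toy logistic classifier is applied directly to latent vectors so the example stays self-contained. The names `rank_directions`, `toy_prob`, `step`, and the weight vector `w` are hypothetical and chosen for illustration only.

```python
import numpy as np

def rank_directions(latents, directions, classifier_prob, step=1.0):
    """Rank candidate directions by how much they change the classifier output.

    latents:         (n, d) array of latent codes (assumed given).
    directions:      (k, d) array of candidate directions.
    classifier_prob: callable mapping (n, d) latents to (n,) probabilities;
                     a stand-in for "decode with the DDIM, then classify".
    Returns (order, effects): direction indices sorted by decreasing effect,
    and the mean absolute probability change for each direction.
    """
    base = classifier_prob(latents)
    effects = []
    for d in directions:
        unit = d / np.linalg.norm(d)  # compare directions at equal step size
        shifted = classifier_prob(latents + step * unit)
        effects.append(float(np.mean(np.abs(shifted - base))))
    effects = np.array(effects)
    order = np.argsort(-effects)  # largest effect first
    return order, effects

# Toy setup (assumption): the classifier depends only on latent dimension 0.
rng = np.random.default_rng(0)
w = np.array([3.0, 0.0, 0.0, 0.0])

def toy_prob(z):
    return 1.0 / (1.0 + np.exp(-(z @ w)))

latents = rng.normal(size=(64, 4))
directions = np.eye(4)  # one candidate direction per latent axis
order, effects = rank_directions(latents, directions, toy_prob)
```

In this toy setting only the first direction changes the classifier probability, so it is ranked first and the others have no measurable effect. With a real generator and classifier, the same ranking would be computed on decoded images rather than on the latent codes themselves.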