DiffEx: Explaining a Classifier with Diffusion Models to Identify Microscopic Cellular Variations
Anis Bourou, Saranga Kingkor Mahanta, Thomas Boyer, Valérie Mezger, Auguste Genovesio
TL;DR
DiffEx addresses the interpretability gap in classifier decisions on biological and natural images. It builds a classifier-aware semantic latent space with an encoder-conditioned diffusion model and uses contrastive learning to find interpretable directions in that space. Directions are ranked by how strongly they shift the classifier's output probabilities, yielding visually faithful, disentangled explanations. The approach can uncover subtle cellular phenotypes in microscopy data and reports stronger qualitative and quantitative results than prior methods such as GCD on both natural and biomedical datasets. Overall, DiffEx offers a practical, diffusion-based route to understanding classifier decisions and using them for biomedical insight and biomarker discovery.
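To make the ranking idea in the TL;DR concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: `prob_fn` is a hypothetical stand-in for "generate an image from the latent code with the diffusion model, then read off the classifier probability", `alpha` is an assumed step size, and scoring by mean absolute probability change is one plausible criterion that the summary does not confirm.

```python
import numpy as np

def rank_directions(z, directions, prob_fn, alpha=1.0):
    """Rank latent directions by how much shifting along each changes prob_fn.

    z          : (n_samples, latent_dim) latent codes
    directions : (n_directions, latent_dim) candidate directions
    prob_fn    : hypothetical callable, latent codes -> class probabilities
    alpha      : assumed step size along each direction
    """
    base = prob_fn(z)
    # One score per direction: mean absolute change in classifier probability.
    scores = np.array([
        np.mean(np.abs(prob_fn(z + alpha * d) - base)) for d in directions
    ])
    order = np.argsort(-scores)  # most influential direction first
    return order, scores

# Toy stand-in for decoder + classifier: a logistic model on the latent code.
def toy_prob(z):
    w = np.array([2.0, 0.0, -0.5])  # the second latent axis has no effect
    return 1.0 / (1.0 + np.exp(-(z @ w)))

rng = np.random.default_rng(0)
z = rng.normal(size=(16, 3))
directions = np.eye(3)  # candidate directions: the three latent axes
order, scores = rank_directions(z, directions, toy_prob)
```

In this toy setup the axis with the largest weight ranks first and the axis with zero weight ranks last. In a real pipeline the probabilities would come from classifying images generated at the shifted latent codes.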
Abstract
In recent years, deep learning models have been extensively applied to biological data across various modalities. Discriminative deep learning models have excelled at classifying images into categories (e.g., healthy versus diseased, treated versus untreated). However, these models are often perceived as black boxes due to their complexity and lack of interpretability, limiting their application in real-world biological contexts. In biological research, explainability is essential: understanding classifier decisions and identifying subtle differences between conditions are critical for elucidating the effects of treatments, disease progression, and biological processes. To address this challenge, we propose DiffEx, a method for generating visually interpretable attributes to explain classifiers and identify microscopic cellular variations between different conditions. We demonstrate the effectiveness of DiffEx in explaining classifiers trained on natural and biological images. Furthermore, we use DiffEx to uncover phenotypic differences within microscopy datasets. By offering insights into cellular variations through classifier explanations, DiffEx has the potential to advance the understanding of diseases and aid drug discovery by identifying novel biomarkers.
