On the notion of missingness for path attribution explainability methods in medical settings: Guiding the selection of medically meaningful baselines

Alexander Geiger, Lars Wagner, Daniel Rueckert, Dirk Wilhelm, Alissa Jell

TL;DR

This paper tackles explainability in medical AI by revisiting the baseline choices used in path attribution methods such as Integrated Gradients (IG). It introduces a principled, input-specific counterfactual baseline, generated with a variational autoencoder, that represents a clinically normal version of the input and enables more faithful attributions. The counterfactual baseline (CF) is applied to IG and to Expected Gradients (EG), the latter giving the variant EG(CF). Across three medical datasets (Manometry, Chest X-ray, Brain MRI), the approach localizes pathology better and aligns more closely with ground-truth annotations than standard baselines and Latent Integrated Gradients (LIG). The results underscore the importance of semantically meaningful baselines for trustworthy explanations. The framework is model-agnostic and can be combined with other counterfactual methods to improve clinical interpretability.

Abstract

The explainability of deep learning models remains a significant challenge, particularly in the medical domain where interpretable outputs are critical for clinical trust and transparency. Path attribution methods such as Integrated Gradients rely on a baseline representing the absence of relevant features ("missingness"). Commonly used baselines, such as all-zero inputs, are often semantically meaningless, especially in medical contexts. While alternative baseline choices have been explored, existing methods lack a principled approach to dynamically select baselines tailored to each input. In this work, we examine the notion of missingness in the medical context, analyze its implications for baseline selection, and introduce a counterfactual-guided approach to address the limitations of conventional baselines. We argue that a generated counterfactual (i.e., a clinically "normal" variation of the pathological input) more accurately represents a meaningful absence of features. We use a Variational Autoencoder in our implementation, though our concept is model-agnostic and can be applied with any suitable counterfactual method. We evaluate our concept on three distinct medical data sets and empirically demonstrate that counterfactual baselines yield more faithful and medically relevant attributions, outperforming standard baseline choices as well as other related methods.
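
To make the role of the baseline concrete, the following is a minimal sketch of Integrated Gradients on a toy model, not the implementation used in the paper. The logistic model, its weights, the input `x`, and the hand-written "counterfactual" `cf_baseline` are all hypothetical stand-ins chosen for illustration; in the paper the counterfactual is generated by a Variational Autoencoder. The sketch approximates the path integral with a midpoint Riemann sum and compares an all-zero baseline with a baseline that differs from the input only in the feature assumed to carry the pathology.

```python
import math

# Hypothetical toy classifier: sigmoid of a linear score (an assumption,
# standing in for the deep classifier whose output is being explained).
W = [2.0, -1.0, 3.0]
B = -2.0

def model(x):
    """Return the classifier confidence for input vector x."""
    score = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-score))

def model_grad(x):
    """Analytic gradient of the confidence with respect to x."""
    p = model(x)
    return [p * (1.0 - p) * w for w in W]

def integrated_gradients(x, baseline, steps=200):
    """Approximate IG along the straight path from baseline to x.

    IG_i = (x_i - b_i) * integral over alpha in [0, 1] of
           d model / d x_i evaluated at b + alpha * (x - b),
    estimated here with a midpoint Riemann sum.
    """
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(model_grad(point)):
            avg_grad[i] += g / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

# Hypothetical "pathological" input; feature 2 is assumed to carry the finding.
x = [0.8, 0.5, 0.9]

# Conventional all-zero baseline.
zero_baseline = [0.0, 0.0, 0.0]

# Hand-written stand-in for a generated counterfactual: identical to x
# except that the assumed pathological feature is set to a "normal" value.
cf_baseline = [0.8, 0.5, 0.1]

attr_zero = integrated_gradients(x, zero_baseline)
attr_cf = integrated_gradients(x, cf_baseline)

print("zero baseline attributions:", attr_zero)
print("counterfactual baseline attributions:", attr_cf)
```

With the all-zero baseline every feature receives attribution, because every feature differs from zero. With the counterfactual baseline only the feature that changes between the "normal" and the pathological input receives attribution, which is the behavior the notion of missingness is meant to capture. In both cases the attributions sum approximately to the difference in model output between the input and the baseline (the completeness property of IG).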

Paper Structure

This paper contains 44 sections, 10 equations, 11 figures, 18 tables.

Figures (11)

  • Figure 1: Overview of our approach to generate a counterfactual baseline $z^*$ for input sample $x$.
  • Figure 2: Examples of attributions obtained using different baselines. Typical colormaps for the respective domains are used (i.e. blue$\rightarrow$red for the Manometry data set and black$\rightarrow$white for the other two). For EG and EG (CF) we show the mean of the sampled baselines, providing a single representation for illustration purposes.
  • Figure 3: Results of the mass-center ablation test on the three data sets. The metric shows the drop of the classifier confidence when increasingly imputing the center of mass of the attributions using different imputation methods. A lower curve indicates a better attribution. The results are aggregated (mean $\pm$ se) over all test samples.
  • Figure 4: Results of the mass-center ablation tests when comparing the additional baseline and explainability concepts.
  • Figure 5: Results of the Top-k ablation tests.
  • ...and 6 more figures