
Exploiting XAI maps to improve MS lesion segmentation and detection in MRI

Federico Spagnolo, Nataliia Molchanova, Mario Ocampo Pineda, Lester Melie-Garcia, Meritxell Bach Cuadra, Cristina Granziera, Vincent Andrearczyk, Adrien Depeursinge

TL;DR

MS lesion segmentation with deep learning often lacks interpretability. The authors adapt instance-level XAI maps from SmoothGrad and GradCAM++ to produce lesion-specific saliency maps for a 3D U-Net and demonstrate that radiomic features extracted from these maps can distinguish true positives from false positives. A logistic regression model trained on 93 saliency-derived radiomic features improves the test F1 score from 0.7006 to 0.7450 and the PPV from 0.6265 to 0.7817, with no detectable domain shift between training and test saliency maps. This approach shows that saliency maps can be leveraged to refine segmentation predictions, offering a path toward more accurate, explainable MS lesion detection in clinical practice.

Abstract

To date, several methods have been developed to explain deep learning algorithms for classification tasks. Recently, two of these methods were adapted to generate instance-level explainable maps in a semantic segmentation scenario, such as multiple sclerosis (MS) lesion segmentation. In that work, a 3D U-Net was trained and tested for MS lesion segmentation, yielding an F1 score of 0.7006 and a positive predictive value (PPV) of 0.6265. The distribution of values in the explainable maps exposed some differences between maps of true positive (TP) and false positive (FP) examples. Inspired by those results, we explore in this paper the use of characteristics of lesion-specific saliency maps to refine segmentation and detection scores. We generate around 21000 maps from as many TP/FP lesions in a batch of 72 patients (training set) and 4868 from the 37 patients in the test set. We extract 93 radiomic features from the training-set maps and use them to train a logistic regression model that classifies TP versus FP. On the test set, the F1 score improved over the initial model from 0.7006 to 0.7450 and the PPV from 0.6265 to 0.7817, with 95% confidence intervals of [0.7358, 0.7547] and [0.7679, 0.7962], respectively. These results suggest that saliency maps can be used to refine prediction scores, boosting a model's performance.
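The refinement step described above, rejecting candidate lesions that a TP-versus-FP classifier considers unlikely and then recomputing the detection scores, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `Candidate` type, the `refine_detections` helper, the 0.5 threshold and the toy probabilities are all hypothetical, and the classifier output is assumed to be already available (in the paper it comes from a logistic regression on radiomic features). The sketch also assumes that a true lesion rejected by the classifier counts as an additional false negative.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One lesion predicted by the segmentation model (hypothetical type)."""
    is_true_lesion: bool   # ground truth: True for a TP, False for an FP
    tp_probability: float  # assumed classifier output for the TP class


def detection_scores(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (PPV, recall, F1) from lesion-wise counts."""
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return ppv, recall, f1


def refine_detections(candidates, n_missed: int, threshold: float = 0.5):
    """Keep candidates scoring at least `threshold`, then recompute scores.

    `n_missed` is the number of true lesions the segmentation model never
    predicted (its original false negatives). Assumption of this sketch:
    true lesions rejected by the classifier are added to the false negatives.
    """
    kept = [c for c in candidates if c.tp_probability >= threshold]
    tp = sum(c.is_true_lesion for c in kept)
    fp = len(kept) - tp
    rejected_tp = sum(c.is_true_lesion for c in candidates) - tp
    return detection_scores(tp, fp, n_missed + rejected_tp)


if __name__ == "__main__":
    # Toy data: six true lesions and four false positives with made-up scores.
    toy = [Candidate(True, p) for p in (0.95, 0.9, 0.8, 0.7, 0.6, 0.3)]
    toy += [Candidate(False, p) for p in (0.6, 0.2, 0.1, 0.05)]
    print("before:", detection_scores(tp=6, fp=4, fn=2))
    print("after: ", refine_detections(toy, n_missed=2, threshold=0.5))
```

In this toy case the filter removes three of the four false positives at the cost of one true lesion, so PPV rises while recall falls. Whether F1 improves depends on that trade-off, which is why the choice of threshold matters.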

Paper Structure (14 sections, 1 equation, 5 figures, 1 table)

Figures (5)

  • Figure 1: Violin plots representing the distribution of the maximum (a) and minimum (b) values of saliency maps. The four distributions refer to true negative (TN), false negative (FN), false positive (FP) and true positive (TP) volumes. Figure retrieved from spagnolo2024.
  • Figure 2: Block diagram describing how an XAI map is used to extract radiomic features. In this example, a true positive lesion is shown in the axial plane.
  • Figure 3: Comparison between mean, maximum and minimum values of saliency maps computed on the training and test set, for TP (a) and FP (b) examples.
  • Figure 4: An example of a slice in the sagittal plane from a saliency map computed on a true positive (a) and a false positive (b) lesion, scoring 0.9398 and 0.0232, respectively, for the true positive class.
  • Figure 5: Normalized radiomic features showing the highest importance (top 10 positive on the left, top 10 negative on the right), in terms of LR coefficients. The dashed red line represents a coefficient value of 0.3, the dashed green line a coefficient value of -0.3.