Aggregated Attributions for Explanatory Analysis of 3D Segmentation Models

Maciej Chrabaszcz; Hubert Baniecki; Piotr Komorowski; Szymon Płotka; Przemyslaw Biecek

TL;DR

3D semantic segmentation models in medical imaging often remain black boxes despite strong predictive performance. This paper introduces Agg^2Exp, an aggregation framework that converts fine-grained voxel attributions into interpretable local and global region-of-interest (RoI) importance explanations, giving a global view of how a segmentation model relies on context and on relationships between anatomical structures. In benchmarks of gradient-based and perturbation-based attributions, the authors find that gradient-based methods, especially SmoothGrad, are more faithful and robust. They demonstrate the approach on Swin UNETR trained on TotalSegmentator v2, including an explanatory use-case that uncovers biases and outlier behavior in a private chest-CT dataset. The work provides a practical, scalable tool for explainability and bias detection in large 3D segmentation models, with publicly available code to support adoption and further research.

Abstract

Analysis of 3D segmentation models, especially in the context of medical imaging, is often limited to segmentation performance metrics that overlook the crucial aspect of explainability and bias. Currently, effectively explaining these models with saliency maps is challenging due to the high dimensions of input images multiplied by the ever-growing number of segmented class labels. To this end, we introduce Agg^2Exp, a methodology for aggregating fine-grained voxel attributions of the segmentation model's predictions. Unlike classical explanation methods that primarily focus on the local feature attribution, Agg^2Exp enables a more comprehensive global view on the importance of predicted segments in 3D images. Our benchmarking experiments show that gradient-based voxel attributions are more faithful to the model's predictions than perturbation-based explanations. As a concrete use-case, we apply Agg^2Exp to discover knowledge acquired by the Swin UNEt TRansformer model trained on the TotalSegmentator v2 dataset for segmenting anatomical structures in computed tomography medical images. Agg^2Exp facilitates the explanatory analysis of large segmentation models beyond their predictive performance. The source code is publicly available at https://github.com/mi2datalab/agg2exp.
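The aggregation idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `local_roi_importance` and `global_roi_importance`, the choice of summing attributions inside each RoI, the normalization by total absolute attribution, and the averaging across cases are all assumptions made for illustration, since the exact formulas are not given in this excerpt.

```python
import numpy as np

def local_roi_importance(attributions, roi_labels, num_rois):
    """Aggregate voxel attributions of ONE explained class over RoIs (assumed scheme).

    attributions: float array of shape (X, Y, Z), one attribution value per voxel.
    roi_labels:   int array of shape (X, Y, Z), RoI index per voxel in [0, num_rois).
    Returns a vector of length num_rois: signed attribution mass per RoI,
    divided by the total absolute attribution so cases are comparable.
    """
    sums = np.bincount(
        roi_labels.ravel(), weights=attributions.ravel(), minlength=num_rois
    )
    total = np.abs(attributions).sum()
    return sums / total if total > 0 else sums

def global_roi_importance(local_importances):
    """Average local RoI importance vectors over a set of cases (assumed scheme)."""
    return np.mean(np.stack(local_importances), axis=0)

# Toy example: one 1x2x2 volume with three RoIs.
att = np.array([[[1.0, -1.0], [2.0, 0.0]]])
rois = np.array([[[0, 1], [0, 2]]])
local = local_roi_importance(att, rois, num_rois=3)
```

Summing signed values keeps the distinction between positive and negative contributions of a region, which a sum of absolute values would hide. Other aggregation or normalization choices are equally plausible here.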

Paper Structure (36 sections, 11 equations, 14 figures, 3 tables)


Introduction
Related work
Segmentation of 3D medical images
Explanation of (2D) segmentation models
Agg$^2$Exp methodology
Voxel attribution methods for 3D segmentation
Gradient-based attributions.
Perturbation-based attributions.
Aggregating attributions for 3D segmentation
Local RoI importance.
Global RoI importance.
Eval. metrics for voxel attribution methods
Faithfulness to the model.
Sensitivity to data perturbation.
Complexity (sparsity).
...and 21 more sections

Figures (14)

Figure 1: Local fine-grained explanations of 3D semantic segmentation are inherently high-dimensional and incomprehensible to humans. We propose to aggregate voxel attributions over the segmented class labels and other regions of interest (RoIs) to provide a more comprehensive global view on the importance of predicted segments in 3D images. Positive and negative attributions between the semantic regions construct an explanatory knowledge graph induced by the black-box segmentation model. Agg$^2$Exp is especially useful in applied sciences like medical imaging, where explainability enables data bias correction and model validation beyond its predictive performance.
Figure 2: Visualization of voxel attributions explaining the prediction of the aorta, computed with different methods: KernelSHAP (cubes and semantic), IG, and VG. The model's prediction of the aorta is highlighted in yellow. The color mapping of attribution values transitions from positive (red) to negative (blue). Slices are chosen as those with the largest area of the segmented class along each dimension (X, Y, Z). To improve readability, we display only the top 95% of values for gradient-based methods and remove attribution values for the background segment in KernelSHAP. We show analogous explanations for other class labels and attribution methods in Appendix \ref{app:visualization}.
Figure 3: Distribution of local RoI importances for two class labels: ribs and pulmonary vein. We color objects by their semantic meaning, i.e., cardiovascular system in red, muscles in yellow, bones in blue, other organs in grey, and lung pathologies in black.
Figure 4: A global RoI importance explanation of the model's 3D segmentation. Aggregated attributions provide a measure of semantic importance between the segmented objects in TSV2. We visualize about one-third of the most important edges for clarity.
Figure 5: Comparison between the three patient cases with the highest (left) and lowest (right) anomaly scores in our B50 dataset.
...and 9 more figures
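The graph view described in the captions of Figures 1 and 4 (signed edges between semantic regions, with only about one-third of the most important edges drawn) can be sketched as follows. This is an illustrative assumption, not the paper's procedure: the helper `top_edges`, the matrix layout `importance[i, j]`, and ranking edges by absolute weight are all hypothetical choices.

```python
import numpy as np

def top_edges(importance, names, keep_fraction=1 / 3):
    """Turn an aggregated importance matrix into a filtered, signed edge list.

    importance[i, j]: assumed aggregated attribution of RoI j to predicted class i.
    Returns (source, target, weight) tuples sorted by |weight|, keeping roughly
    keep_fraction of the nonzero off-diagonal edges.
    """
    n = importance.shape[0]
    edges = [
        (names[j], names[i], float(importance[i, j]))
        for i in range(n)
        for j in range(n)
        if i != j and importance[i, j] != 0
    ]
    edges.sort(key=lambda e: abs(e[2]), reverse=True)
    k = max(1, round(len(edges) * keep_fraction)) if edges else 0
    return edges[:k]

# Toy 3-region example with hypothetical region names and weights.
imp = np.array([[0.0, 0.5, -0.1],
                [0.2, 0.0, 0.0],
                [-0.9, 0.3, 0.0]])
kept = top_edges(imp, ["a", "b", "c"])
```

Keeping the sign of each weight preserves the positive/negative distinction mentioned in Figure 1, while ranking by magnitude decides which edges survive the filter.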
