SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

Saarthak Kapse; Pushpak Pati; Srijan Das; Jingwei Zhang; Chao Chen; Maria Vakalopoulou; Joel Saltz; Dimitris Samaras; Rajarsi R. Gupta; Prateek Prasanna

TL;DR

This work introduces Self-Interpretable MIL (SI-MIL), a dual-branch framework that jointly trains a deep MIL model with a pathologist-friendly, handcrafted PathExpert feature branch to deliver inherent, feature-level interpretability for gigapixel histopathology WSIs. By tying a differentiable Top-$K$ patch selector to a linear, PathExpert-based predictor and guiding it with deep feature supervision, SI-MIL achieves competitive predictive performance across breast, lung, and colorectal cancer datasets while providing faithful, pathologist-aligned explanations. The authors validate interpretability locally (slide-level explanations) and globally (cohort-level separability), conduct domain expert studies, and release a ~2.2K-WSI dataset with nuclei maps and PathExpert features to spur reproducible interpretable MIL research. Overall, SI-MIL demonstrates that interpretability and accuracy can be jointly optimized in WSIs, enabling feature-grounded reasoning that aligns with clinical practice and paving the way for broader adoption in pathology. The work also outlines extensive ablations, dataset contributions, and visualization tools to support future work in interpretable MIL for medical imaging.

Abstract

Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides. Traditionally, MIL interpretability is limited to identifying salient regions deemed pertinent for downstream tasks, offering little insight to the end-user (pathologist) regarding the rationale behind these selections. To address this, we propose Self-Interpretable MIL (SI-MIL), a method intrinsically designed for interpretability from the very outset. SI-MIL employs a deep MIL framework to guide an interpretable branch grounded on handcrafted pathological features, facilitating linear predictions. Beyond identifying salient regions, SI-MIL uniquely provides feature-level interpretations rooted in pathological insights for WSIs. Notably, SI-MIL, with its linear prediction constraints, challenges the prevalent myth of an inevitable trade-off between model interpretability and performance, demonstrating competitive results compared to state-of-the-art methods on WSI-level prediction tasks across three cancer types. In addition, we thoroughly benchmark the local and global interpretability of SI-MIL through statistical analysis, a domain expert study, and two desiderata of interpretability, namely user-friendliness and faithfulness.
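To make the idea of a linear, feature-level prediction concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hard (non-differentiable) top-K selection in place of the differentiable selector the paper describes, and a plain mean over the selected patches as the aggregation step. The function names, the toy feature values, the weights, and the bias are all hypothetical.

```python
import math

def top_k_indices(attention, k):
    """Indices of the k highest attention scores (hard selection; an assumption)."""
    order = sorted(range(len(attention)), key=lambda i: attention[i], reverse=True)
    return order[:k]

def linear_slide_prediction(features, attention, weights, bias, k):
    """Average the handcrafted features of the top-k patches, then apply one linear layer.

    Returns the slide-level probability, the per-feature contributions to the
    logit, and the indices of the selected patches.
    """
    selected = top_k_indices(attention, k)
    mean_feature = [
        sum(features[i][j] for i in selected) / len(selected)
        for j in range(len(weights))
    ]
    # Because the predictor is linear, each feature's share of the logit is weight * value.
    contributions = [w * x for w, x in zip(weights, mean_feature)]
    logit = sum(contributions) + bias
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions, selected

# Toy slide: 4 patches, 2 handcrafted features per patch (all values hypothetical).
features = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0], [0.5, 0.5]]
attention = [0.1, 0.7, 0.9, 0.2]  # patch scores from a deep MIL branch (assumed given)
weights, bias = [2.0, -1.0], 0.5

probability, contributions, selected = linear_slide_prediction(
    features, attention, weights, bias, k=2
)
print(selected, contributions, round(probability, 4))
```

Because the logit is a plain sum of per-feature terms, the `contributions` list can be read directly as a feature-level explanation of the slide prediction, which is the property the linear constraint is meant to provide.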

Paper Structure (35 sections, 10 equations, 16 figures, 9 tables)

Introduction
Related work
Method
Conventional MIL
WSI patch feature extraction
Self-Interpretable MIL (SI-MIL)
Experiments: Prediction Performance
Datasets and Implementation details
Slide-level classification performance
Experiments and Results: Interpretability
Local Interpretation: Slide-level
Global Interpretation: Cohort-level
Dataset contribution
Conclusion
Acknowledgments
...and 20 more sections

Figures (16)

Figure 1: Unlike conventional MIL, SI-MIL co-learns from deep and handcrafted features (referred to as PathExpert features). While both MILs offer patch-level interpretability, only ours provides PathExpert feature-level rationale for WSI predictions. The attention maps in SI-MIL are grounded on geometrically and physically-interpretable descriptors.
Figure 2: Overview of SI-MIL: The conventional MIL branch guides the Patch Attention-Guided Top-$K$ (PAG Top-$K$) patch selection module to select the PathExpert features of key regions from the WSI, followed by linear scaling in the Self-Interpretable branch and linear prediction.
Figure 3: Qualitative Patch-Feature importance report: In (a) and (b), we present WSIs with overlaid attention heatmaps and the top two patches, along with their nuclei maps. In (c), we demonstrate the mean contribution magnitude of select representative features across the top $K$ patches employed in the Self-Interpretable branch. Additionally, we display a feature density plot that quantifies the distribution of features within the $K$ patches. For brevity, we omit the y-axis. Given that these features are normalized, a curve leaning towards the right indicates higher/positive values, while one towards the left signifies lower/negative values, depending on the feature. Finally, in (d), we illustrate the description and visualization of the representative features from (c) at varying values.
Figure 4: Cohort-level Interpretation: Separability of top $K$ patches of WSIs across classes in the PathExpert feature space. Multivariate and Univariate analyses depict that the top $K$ patches selected by SI-MIL and their PathExpert features are more separable.
Figure 5: PAG Top-$K$ module ablation
...and 11 more figures
