FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography
Julia Yang, Alina Jade Barnett, Jon Donnelly, Satvik Kishore, Jerry Fang, Fides Regina Schwartz, Chaofan Chen, Joseph Y. Lo, Cynthia Rudin
TL;DR
The paper addresses the need for interpretable deep learning in radiology, focusing on mass-margin classification in digital mammography. It introduces FPN-IAIA-BL, a multi-scale, prototype-based architecture that extends IAIA-BL with a Feature Pyramid Network to learn interpretable prototypes across scales, using cosine similarity with focal pooling and a three-stage training scheme guided by a fine-annotation loss. Training uses a large Duke Health dataset containing lesion and negative examples, with a dedicated negative class, and radiologist-informed coefficients shape the prototype activations. Empirically, FPN-IAIA-BL yields localized, scale-aware explanations, achieving an average AUROC of $0.88$ (circumscribed $0.88$, indistinct $0.87$, spiculated $0.86$; overall $0.91$), though IAIA-BL remains stronger in raw accuracy. The work contributes a general, scalable architecture for case-based explanations in computer vision that can be adapted to other high-stakes domains.
Abstract
Digital mammography is essential to breast cancer detection, and deep learning offers promising tools for faster and more accurate mammogram analysis. In radiology and other high-stakes environments, uninterpretable ("black box") deep learning models are unsuitable, and these fields have called for interpretable models instead. Recent work in interpretable computer vision brings transparency to these formerly black-box models by using prototypes for case-based explanations, achieving high accuracy in applications including mammography. However, these models struggle with precise feature localization: they reason on large portions of an image even when only a small part is relevant. This paper addresses this gap by proposing a novel multi-scale interpretable deep learning model for mammographic mass margin classification. Our contribution not only offers an interpretable model whose reasoning aligns with radiologist practice, but also provides a general architecture for computer vision with user-configurable prototypes ranging from coarse- to fine-grained.
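
To make the idea of prototypes at coarse and fine scales concrete, the following is a minimal sketch of how a prototype might be compared against feature maps at several pyramid levels using cosine similarity. It is not the authors' implementation: the function names, the dictionary-of-scales layout, and the toy feature values are all hypothetical, and plain max pooling over locations stands in for the focal pooling used by the model, whose details are not given here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(feature_map, prototype):
    """Return (similarity, row, col) of the location most similar to the prototype.

    feature_map is an H x W grid whose cells are C-dimensional feature vectors.
    Max pooling over locations is an assumption standing in for focal pooling.
    """
    candidates = (
        (cosine_similarity(cell, prototype), i, j)
        for i, row in enumerate(feature_map)
        for j, cell in enumerate(row)
    )
    return max(candidates, key=lambda t: t[0])


def multi_scale_scores(pyramid, prototypes_by_scale):
    """Score every prototype against the feature map of its own pyramid level."""
    return {
        scale: [best_match(feature_map, p) for p in prototypes_by_scale[scale]]
        for scale, feature_map in pyramid.items()
    }


# Toy two-level pyramid (hypothetical values): a 1x1 coarse map and a 2x2 fine map.
pyramid = {
    "coarse": [[[1.0, 0.0]]],
    "fine": [
        [[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 1.0], [0.0, 2.0]],
    ],
}
prototypes_by_scale = {
    "coarse": [[2.0, 0.0]],
    "fine": [[0.0, 1.0], [1.0, 1.0]],
}

scores = multi_scale_scores(pyramid, prototypes_by_scale)
for scale, matches in scores.items():
    for k, (sim, i, j) in enumerate(matches):
        print(f"{scale} prototype {k}: similarity {sim:.3f} at ({i}, {j})")
```

Because each prototype is tied to one pyramid level, the location of its best match indicates both where in the image and at which scale the evidence was found. In this sketch, that pairing is what yields a localized, scale-aware explanation.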
