A Framework for Interpretability in Machine Learning for Medical Imaging
Alan Q. Wang, Batuhan K. Karaman, Heejong Kim, Jacob Rosenthal, Rachit Saluja, Sean I. Young, Mert R. Sabuncu
TL;DR
This paper introduces a domain-specific framework for interpretability in machine learning for medical imaging (MLMI). It identifies five core elements of interpretability (localization, visual recognizability, physical attribution, model transparency, and actionability), derived from real-world medical imaging tasks and goals. It grounds these elements in a step-by-step framework that connects use-case goals to appropriate interpretability methods, illustrated through three case studies: automated diabetic retinopathy screening, breast cancer detection, and visual encoding of brain responses. The work surveys existing interpretability methods and maps them to the identified elements, discusses intended users, faithfulness, and limitations, and highlights opportunities for future mechanistic and inductive-bias-driven approaches. Overall, the framework aims to guide clinicians and researchers in designing, validating, and deploying interpretable MLMI tools that support safety, trust, continual learning, and equitable use in real-world settings.
Abstract
Interpretability for machine learning models in medical imaging (MLMI) is an important direction of research. However, there is a general sense of murkiness about what interpretability means. Why does the need for interpretability in MLMI arise? What goals does one actually seek to address when interpretability is needed? To answer these questions, we identify a need to formalize the goals and elements of interpretability in MLMI. By reasoning about real-world tasks and goals common in both medical image analysis and its intersection with machine learning, we identify five core elements of interpretability: localization, visual recognizability, physical attribution, model transparency, and actionability. From this, we arrive at a framework for interpretability in MLMI, which serves as a step-by-step guide to approaching interpretability in this context. Overall, this paper formalizes interpretability needs in the context of medical imaging, and our applied perspective clarifies concrete MLMI-specific goals and considerations to guide method design and improve real-world usage. Our goal is to provide practical and didactic information for model designers and practitioners, to inspire developers of models in the medical imaging field to reason more deeply about what interpretability is achieving, and to suggest future directions for interpretability research.
