Interactive Medical Image Analysis with Concept-based Similarity Reasoning
Ta Duc Huy, Sen Kim Tran, Phan Nguyen, Nguyen Hoang Tran, Tran Bao Sam, Anton van den Hengel, Zhibin Liao, Johan W. Verjans, Minh-Son To, Vu Minh Hieu Phan
TL;DR
This work introduces Concept-based Similarity Reasoning (CSR), a framework for interpretable medical image analysis that grounds patch-level prototypes to human-interpretable concepts and enables spatial doctor-in-the-loop interactions. CSR constructs an atlas of prototypes per concept and uses a cosine-based similarity reasoning pipeline to predict health conditions, with a contrastively learned projection space to improve generalization. The method supports both train-time and test-time interactions, including refining the concept atlas to mitigate shortcuts and spatially directing the model's attention via bounding-box feedback. Empirically, CSR achieves up to 4.5% F1 improvement across three biomedical datasets, with doctor refinement boosting trust metrics (Pointing Game) and interactive feedback enhancing prediction quality, suggesting strong potential for safer, more transparent clinical deployment.
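As a rough illustration of the cosine-based similarity reasoning described above, the following minimal sketch scores an image's patch features against a per-concept atlas of prototypes. It is not the authors' implementation: the function names, the array shapes, the max-pooling over patches, and the omission of the learned projection are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale vectors to (approximately) unit length so dot products become cosines.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def concept_similarity_scores(patch_feats, atlas):
    """Hypothetical helper, not taken from the paper's code.

    patch_feats: (N, D) array, one D-dim feature per image patch.
    atlas:       (K, M, D) array, M prototypes for each of K concepts.
    Returns:
      scores:   (K, M) best cosine similarity of each prototype over all patches.
      sim_maps: (K, M, N) cosine similarity of each prototype to each patch.
    """
    f = l2_normalize(patch_feats)
    p = l2_normalize(atlas)
    # Cosine similarity between every prototype and every patch.
    sim_maps = np.einsum("kmd,nd->kmn", p, f)
    # Assumption: a prototype's score is its best-matching patch (max-pooling).
    scores = sim_maps.max(axis=-1)
    return scores, sim_maps

def predict_logits(scores, weights, bias):
    # Assumption: a linear layer maps prototype scores to condition logits.
    return scores.reshape(-1) @ weights + bias
```

In this sketch, each `sim_maps[k, m]` can be reshaped to the patch grid to show where a concept prototype fires, which is the kind of localized map that spatial feedback such as a bounding box could act on.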
Abstract
The ability to interpret and intervene in model decisions is important for the adoption of computer-aided diagnosis methods in clinical workflows. Recent concept-based methods link model predictions to interpretable concepts and let users interact with the model by modifying concept activation scores. However, these concepts are defined at the image level, which prevents the model from pinpointing the exact patches where the concepts are activated. Alternatively, prototype-based methods learn representations from training image patches and compare them with test image patches, using the similarity scores for the final class prediction. However, interpreting the underlying concepts of these patches can be challenging and often requires post-hoc guesswork. To address this issue, this paper introduces the Concept-based Similarity Reasoning network (CSR), which offers (i) patch-level prototypes with intrinsic concept interpretation, and (ii) spatial interactivity. First, CSR provides localized explanations by grounding the prototypes of each concept on image regions. Second, our model introduces spatial-level interaction, allowing doctors to engage directly with specific image areas, making it an intuitive and transparent tool for medical imaging. CSR improves upon prior state-of-the-art interpretable methods by up to 4.5% across three biomedical datasets. Our code is released at https://github.com/tadeephuy/InteractCSR.
