Interactive Medical Image Analysis with Concept-based Similarity Reasoning

Ta Duc Huy, Sen Kim Tran, Phan Nguyen, Nguyen Hoang Tran, Tran Bao Sam, Anton van den Hengel, Zhibin Liao, Johan W. Verjans, Minh-Son To, Vu Minh Hieu Phan

TL;DR

This work introduces Concept-based Similarity Reasoning (CSR), a framework for interpretable medical image analysis that grounds patch-level prototypes to human-interpretable concepts and enables spatial doctor-in-the-loop interactions. CSR constructs an atlas of prototypes per concept and uses a cosine-based similarity reasoning pipeline to predict health conditions, with a contrastively learned projection space to improve generalization. The method supports both train-time and test-time interactions, including refining the concept atlas to mitigate shortcuts and spatially directing the model's attention via bounding-box feedback. Empirically, CSR achieves up to 4.5% F1 improvement across three biomedical datasets, with doctor refinement boosting trust metrics (Pointing Game) and interactive feedback enhancing prediction quality, suggesting strong potential for safer, more transparent clinical deployment.
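The cosine-based similarity reasoning step can be illustrated with a minimal NumPy sketch. It follows the description in the Figure 2 caption (cosine similarity between concept prototypes and the feature map, spatial maximum as the similarity score, scores mapped to logits). The tensor shapes, the function names, and the use of a single linear layer with `weights` and `bias` are assumptions made for illustration; this is not the released implementation.

```python
import numpy as np

def cosine_similarity_maps(features, prototypes, eps=1e-8):
    """Cosine similarity between every prototype and every spatial location.

    features:   (C, H, W) feature map of one image (assumed layout)
    prototypes: (K, M, C) atlas with M prototypes for each of K concepts
    returns:    (K, M, H, W) similarity maps
    """
    # L2-normalise along the channel axis so the dot product is a cosine.
    f = features / (np.linalg.norm(features, axis=0, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    return np.einsum("kmc,chw->kmhw", p, f)

def predict_logits(features, prototypes, weights, bias):
    """Max-pool each similarity map into a score, then map scores to logits.

    weights: (num_classes, K * M), bias: (num_classes,)  (hypothetical linear head)
    """
    maps = cosine_similarity_maps(features, prototypes)
    K, M, _, _ = maps.shape
    scores = maps.reshape(K, M, -1).max(axis=-1)   # (K, M) similarity scores
    logits = weights @ scores.reshape(-1) + bias   # (num_classes,)
    return logits, scores, maps
```

Because each score is the maximum of a 2D map, the location of that maximum also indicates which image patch matched the prototype, which is what makes the explanation patch-level.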

Abstract

The ability to interpret and intervene in model decisions is important for the adoption of computer-aided diagnosis methods in clinical workflows. Recent concept-based methods link model predictions to interpretable concepts and let users interact with the model by modifying concept activation scores. However, these concepts are defined at the image level, which prevents the model from pinpointing the exact patches where the concepts are activated. Alternatively, prototype-based methods learn representations from training image patches and compare these with test image patches, using the similarity scores for final class prediction. However, interpreting the underlying concepts of these patches can be challenging and often necessitates post-hoc guesswork. To address this issue, this paper introduces the novel Concept-based Similarity Reasoning network (CSR), which offers (i) patch-level prototypes with intrinsic concept interpretation, and (ii) spatial interactivity. First, the proposed CSR provides localized explanations by grounding the prototypes of each concept on image regions. Second, our model introduces novel spatial-level interaction, allowing doctors to engage directly with specific image areas, making it an intuitive and transparent tool for medical imaging. CSR improves upon prior state-of-the-art interpretable methods by up to 4.5% across three biomedical datasets. Our code is released at https://github.com/tadeephuy/InteractCSR.

Paper Structure

This paper contains 11 sections, 13 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Enhancing Transparency in Diagnosis Workflow with Doctor-in-the-Loop using CSR. Top section: CSR predicts Pulmonary Edema, falsely associating the image with the two concepts lung fluid and enlarged heart. For each concept, CSR explains its prediction by comparing the input with the highlighted region on the prototype image to create the corresponding similarity map. Bottom section: First, as the doctor inspects the similarity maps of each concept, he suppresses the incorrect lung fluid attention and reinforces the attention on the opacity on the left of the image through spatial interaction: "drawing positive and negative boxes" to create an importance map indicating where to focus and what to ignore. Second, noting that the heart is normal, he "rejects" the enlarged heart concept via concept interaction. CSR then recalibrates its prediction to Tuberculosis, aligning with the observed opacity.
  • Figure 2: Inference logic of CSR. Each concept prototype represents a specific concept from a training image. CSR generates 2D similarity maps by computing the cosine similarity between these concept prototypes and the feature maps, and considers the maximum values as similarity scores to calculate the prediction logits. The prototype image refers to the training image associated with a specific concept prototype, as detailed in Sec. \ref{subsec:model_exp}.
  • Figure 3: The novel concept prototype learning framework. We pretrain a Concept model to generate concept activation maps. The local concept vectors are then generated by weighting and summing the feature maps with the activation map of the corresponding concept. To enhance the compactness and generalizability of the concept feature space, we introduce a novel multi-prototype learning objective. After the concept vectors are projected via a projector $P$, the proposed objective pulls each concept feature toward its nearest prototype of the same concept, while pushing it away from the prototypes of other concepts.
  • Figure 4: Contrastive learning improves the concept feature space. (a) Qualitative and (b) quantitative comparison.
  • Figure 5: Irrelevant concept prototypes to be discarded during train-time interaction, with their activated regions highlighted.
  • ...and 1 more figure
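The two steps described in the Figure 3 caption, pooling local concept vectors from activation maps and the multi-prototype objective, can be sketched as follows. This is a rough illustration under stated assumptions: the shapes, the function names, the `temperature` value, and the softmax cross-entropy (InfoNCE-style) form of the loss are hypothetical choices, since the caption only states that features are pulled to the nearest prototype of their concept and pushed away from prototypes of other concepts. The projector $P$ is omitted; `z` is assumed to be already projected.

```python
import numpy as np

def local_concept_vectors(features, cams):
    """Weight the feature map by each concept activation map and sum over space.

    features: (C, H, W) feature map (assumed layout)
    cams:     (K, H, W) one activation map per concept
    returns:  (K, C) one local concept vector per concept
    """
    return np.einsum("khw,chw->kc", cams, features)

def multi_prototype_loss(z, concept, prototypes, temperature=0.1, eps=1e-8):
    """Assumed contrastive form of the multi-prototype objective.

    z:          (D,) projected concept vector
    concept:    integer index of the concept that z belongs to
    prototypes: (K, M, D) M prototypes for each of K concepts
    """
    z = z / (np.linalg.norm(z) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    sims = (p @ z) / temperature                   # (K, M) scaled cosine similarities
    positive = sims[concept].max()                 # nearest prototype of the same concept
    negatives = np.delete(sims, concept, axis=0).ravel()  # prototypes of other concepts
    logits = np.concatenate(([positive], negatives))
    # Cross-entropy with the positive as the target, using a stable log-sum-exp.
    m = logits.max()
    return float(m + np.log(np.exp(logits - m).sum()) - positive)
```

Using only the nearest same-concept prototype as the positive lets one concept keep several distinct prototypes (for example, different visual appearances of the same finding) instead of collapsing them into a single mean vector.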