A Two-Step Concept-Based Approach for Enhanced Interpretability and Trust in Skin Lesion Diagnosis

Cristiano Patrício, Luís F. Teixeira, João C. Neves

TL;DR

This paper tackles interpretability and annotation burden in skin lesion diagnosis by proposing a two-step, training-free framework: a pretrained Vision-Language Model predicts dermoscopic concepts from images, and an off-the-shelf Large Language Model generates a diagnosis grounded in those concepts via tailored prompts. The method supports test-time human intervention: a user can correct the predicted concepts, which improves the final diagnosis and makes the decision process more transparent. Across PH$^2$, Derm7pt, and HAM10000, the approach outperforms traditional Concept Bottleneck Models and state-of-the-art explainable methods while requiring no training and only a few annotated examples. The results indicate strong interpretability, practical utility in clinical settings, and easy extension to new concepts, with potential applicability to other medical imaging domains.

Abstract

The main challenges hindering the adoption of deep learning-based systems in clinical settings are the scarcity of annotated data and the lack of interpretability and trust in these systems. Concept Bottleneck Models (CBMs) offer inherent interpretability by constraining the final disease prediction on a set of human-understandable concepts. However, this inherent interpretability comes at the cost of greater annotation burden. Additionally, adding new concepts requires retraining the entire system. In this work, we introduce a novel two-step methodology that addresses both of these challenges. By simulating the two stages of a CBM, we utilize a pretrained Vision Language Model (VLM) to automatically predict clinical concepts, and an off-the-shelf Large Language Model (LLM) to generate disease diagnoses based on the predicted concepts. Furthermore, our approach supports test-time human intervention, enabling corrections to predicted concepts, which improves final diagnoses and enhances transparency in decision-making. We validate our approach on three skin lesion datasets, demonstrating that it outperforms traditional CBMs and state-of-the-art explainable methods, all without requiring any training and utilizing only a few annotated examples. The code is available at https://github.com/CristianoPatricio/2-step-concept-based-skin-diagnosis.
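
The two-step flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are toy vectors standing in for VLM outputs, the cosine-similarity threshold is an assumed scoring rule, the prompt wording is hypothetical, and the function names `predict_concepts`, `intervene`, and `build_prompt` are invented for this example. The LLM call itself is left out; the resulting prompt would be sent to whichever off-the-shelf LLM is used.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def predict_concepts(image_emb, concept_embs, threshold=0.5):
    """Step 1 (stand-in for the VLM): a concept counts as present when the
    image/concept similarity exceeds an assumed threshold."""
    return {name: cosine(image_emb, emb) > threshold
            for name, emb in concept_embs.items()}


def intervene(concepts, corrections):
    """Test-time intervention: a human overrides selected concept predictions."""
    return {**concepts, **corrections}


def build_prompt(concepts, demos=()):
    """Step 2 input: a few-shot prompt grounded in the (possibly corrected)
    concepts. `demos` is a sequence of (concepts, diagnosis) pairs."""
    def describe(c):
        present = [name for name, is_present in c.items() if is_present]
        return ", ".join(present) if present else "none"

    blocks = ["Diagnose the skin lesion from its dermoscopic concepts."]
    for demo_concepts, label in demos:
        blocks.append(f"Concepts: {describe(demo_concepts)}\nDiagnosis: {label}")
    blocks.append(f"Concepts: {describe(concepts)}\nDiagnosis:")
    return "\n\n".join(blocks)


# Toy 2-D embeddings (hypothetical values, for illustration only).
concept_embs = {
    "atypical pigment network": [0.9, 0.1],
    "blue-whitish veil": [0.1, 0.9],
}
predicted = predict_concepts([1.0, 0.0], concept_embs)
corrected = intervene(predicted, {"blue-whitish veil": True})
prompt = build_prompt(
    corrected,
    demos=[({"atypical pigment network": False, "blue-whitish veil": False}, "nevus")],
)
print(prompt)
```

Because the diagnosis step only sees the concept list, a corrected concept changes the prompt directly, which is what makes the intervention transparent. Adding a new concept amounts to adding one more entry to `concept_embs`, with no retraining.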

Paper Structure

This paper contains 28 sections, 2 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Overview of the proposed framework. The linear classifier layer (left) is replaced by a pretrained Large Language Model (LLM) (right), which grounds its responses on clinical concepts predicted by a pretrained vision-language model (VLM). This approach is training-free and not restricted by predefined labels, allowing the LLM to generate diverse diagnostic possibilities for different diseases.
  • Figure 2: Example prompt with $K=2$ demonstration examples. Few-shot prompting is expected to improve performance as the number of demonstration examples increases.
  • Figure 3: Few-shot disease classification performance (in BACC %) across different $n$-shot settings. Each bar corresponds to an $n$-shot scenario ($n \in \{0, 1, 2, 4, 8\}$).
  • Figure 4: Examples of skin images diagnosed using our approach. The concept-based explanations alongside the predicted disease label allow for further inspection of the skin image, especially when a concept prediction appears inconsistent with the diagnosis.
  • Figure 5: Predicted dermoscopic concepts for each target class in Derm7pt. The width of the lines indicates the frequency with which each concept is predicted for its respective class. This visualization highlights the key features associated with nevus and melanoma, providing insights into the most relevant characteristics for each diagnosis.
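
Figure 3 reports results in BACC. Assuming BACC denotes balanced accuracy in its usual sense (the mean of per-class recall), the metric can be sketched as follows. The labels below are toy data invented for illustration and are not results from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally
    regardless of how many samples it has."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy, imbalanced example (hypothetical labels).
y_true = ["nevus"] * 4 + ["melanoma"] * 2
y_pred = ["nevus", "nevus", "nevus", "melanoma", "melanoma", "nevus"]

bacc = balanced_accuracy(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"BACC = {bacc:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the recall is 3/4 for nevus and 1/2 for melanoma, giving a balanced accuracy of 0.625 against a plain accuracy of about 0.667. The gap shows why a class-balanced metric is a common choice for imbalanced lesion datasets.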