DWARF: Disease-weighted network for attention map refinement

Haozhe Luo; Aurélie Pahud de Mortanges; Oana Inel; Abraham Bernstein; Mauricio Reyes

TL;DR

DWARF tackles interpretability in medical imaging by integrating clinicians into the training loop to refine attention maps via disease-specific guidance. It combines a pretrained Vision-Language Model (VLM) with disease-specific segmentation heads and cyclic training to align explanations with findings. Across ChestX-Det, CheXlocalize, and VinDr-CXR, DWARF achieves state-of-the-art performance and more trustworthy attention maps, while clinician evaluations indicate higher confidence in AI-assisted classifications. The work also introduces Identity Enhanced Initialization (IEI) to mitigate shortcut learning and discusses future directions for transferability and few-shot adaptation.

Abstract

The interpretability of deep learning is crucial for evaluating the reliability of medical imaging models and reducing the risks of inaccurate patient recommendations. This study addresses the "human out of the loop" and "trustworthiness" issues in medical image analysis by integrating medical professionals into the interpretability process. We propose a disease-weighted attention map refinement network (DWARF) that leverages expert feedback to enhance model relevance and accuracy. Our method employs cyclic training to iteratively improve diagnostic performance, generating precise and interpretable feature maps. Experimental results demonstrate significant improvements in interpretability and diagnostic accuracy across multiple medical imaging datasets. This approach fosters effective collaboration between AI systems and healthcare professionals, ultimately aiming to improve patient outcomes.
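
The cyclic training mentioned above is described in the Figure 1 caption as training a single disease per epoch with the disease name as the prompt. The following is a minimal sketch of that schedule only, not the authors' implementation: the disease list, the function names `cyclic_schedule` and `train`, and the assumption that only the current disease's head is updated in its epoch are all hypothetical.

```python
from itertools import cycle, islice

# Hypothetical disease list; the real label sets come from the datasets.
DISEASES = ["Atelectasis", "Cardiomegaly", "Effusion"]


def cyclic_schedule(diseases, num_epochs):
    """Return the disease trained in each epoch, cycling through the list in order."""
    return list(islice(cycle(diseases), num_epochs))


def train(diseases, num_epochs):
    """Toy loop that only counts how often each disease-specific head is visited."""
    head_updates = {d: 0 for d in diseases}  # one additional head per disease
    for disease in cyclic_schedule(diseases, num_epochs):
        prompt = disease  # the disease name serves as the text prompt
        # Assumption: a real step would run the VLM with `prompt` and update
        # only the head that belongs to this disease.
        head_updates[prompt] += 1
    return head_updates
```

With three diseases, every head is revisited once every three epochs, which is the sense in which the schedule is cyclic.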

Paper Structure (19 sections, 4 equations, 5 figures, 4 tables, 1 algorithm)

Introduction
Method
Architecture and training strategy
Losses and network initialization
Loss Function
Model Initialization
Experiments and Results
Dataset
Baselines
Training Details
Quantitative results
DWARF achieves SoTA results compared to other pretrained/finetuned VLM baselines.
DWARF achieves enhanced Stability and Scalability
DWARF's Independence from Extensive Annotation
DWARF Enhances Clinician Confidence in Classification Models
...and 4 more sections

Figures (5)

Figure 1: Flow chart of fine-tuning the classification model. Our method trains on a single disease per epoch, using the disease name as the prompt. For each disease, we add an additional head that maps the original attention to a refined segmentation map.
Figure 2: With random initialization, the model tends to learn a shortcut that always highlights the same area. With Identity Enhanced Initialization (IEI), the model starts from the pretrained VLM's attention and refines its focus from there.
Figure 3: DWARF demonstrates sustained learning capacity, benefiting from extended training epochs, whereas the baseline model suffers from overfitting with additional training.
Figure 4: Qualitative results of training with and without the DWARF architecture demonstrate that utilizing our DWARF framework consistently enhances the aggregation of feature maps and provides prior region information.
Figure : Training Process for DWARF
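
The contrast drawn in the Figure 2 caption can be illustrated with a toy head. This summary does not describe how IEI is actually constructed, so the sketch below rests on an assumption: the head is a per-pixel affine map that starts as the identity, so its first output equals the pretrained attention. `AffineHead`, `identity_head`, and `random_head` are hypothetical names.

```python
import random


class AffineHead:
    """Toy per-pixel head: refined = weight * attention + bias."""

    def __init__(self, weight, bias):
        self.weight = weight
        self.bias = bias

    def __call__(self, attention_map):
        return [[self.weight * a + self.bias for a in row] for row in attention_map]


def identity_head():
    """Assumed identity-style start: the head initially passes attention through unchanged."""
    return AffineHead(weight=1.0, bias=0.0)


def random_head(seed=0):
    """Random start: the head initially discards the structure of the pretrained attention."""
    rng = random.Random(seed)
    return AffineHead(weight=rng.uniform(-1, 1), bias=rng.uniform(-1, 1))


# Hypothetical 2x2 attention map standing in for the pretrained VLM's output.
pretrained_attention = [[0.1, 0.8], [0.3, 0.0]]
refined_at_start = identity_head()(pretrained_attention)
```

Under this assumption, `refined_at_start` equals `pretrained_attention`, so training begins from the VLM's existing focus instead of from an arbitrary map.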
