Towards Interpretable Foundation Models for Retinal Fundus Images

Samuel Ofosu Mensah; Maria Camila Roa Carvajal; Kerol Djoumessi; Philipp Berens

Abstract

Foundation models are used to extract transferable representations from large amounts of unlabeled data, typically via self-supervised learning (SSL). However, many of these models rely on architectures that offer limited interpretability, which is a critical issue in high-stakes domains such as medical imaging. We propose Dual-IFM, a foundation model that is interpretable by design in two ways: First, it provides local interpretability for individual images through class evidence maps that are faithful to the decision-making process. Second, it provides global interpretability for entire datasets through a 2D projection layer that allows for direct visualization of the model's representation space. We trained our model on over 800,000 color fundus photographs from various sources to learn generalizable, interpretable representations for different downstream tasks. Our results show that our model reaches a performance range similar to that of state-of-the-art foundation models with up to $16\times$ the number of parameters, while providing interpretable predictions on out-of-distribution data. Our results suggest that large-scale SSL pretraining paired with inherent interpretability can lead to robust representations for retinal imaging.

Paper Structure (13 sections, 3 figures, 2 tables)

This paper contains 13 sections, 3 figures, 2 tables.

Introduction
Methods
Pretraining
Fine-tuning
Dual Interpretability
Experimental details
Results
Performance on eye disease classification
Global interpretability through latent space visualization
Local interpretability through class evidence maps
Discussion

Figures (3)

Figure 1: Overview of our inherently interpretable foundation model. (A) The encoder is pretrained on a large dataset of color fundus photographs (CFP) with the t-SimCNE algorithm to learn generalizable representations and a projection that maps input images to a 2D space for visualization of the encoder's representation space. (B) The model can be fine-tuned on downstream tasks while still allowing the visualization of the representation space.
Figure 2: Global interpretability via representation space visualization. 2D embeddings produced by the learned projection head for (A) APTOS (k-NN AUROC: $0.87$) and (B) Glaucoma Fundus (k-NN AUROC: $0.85$). For both datasets, the representation space reveals a continuous transition between classes that mimics disease progression, as well as potential borderline samples.
Figure 3: Local interpretability through class evidence maps. (A) Fine-tuning performance for different values of sparsity. (B) Precision score for different values of sparsity. (C, E) Example fundus images. (D, F) Class evidence maps of Dual-IFM for these images with lesion annotations overlaid (green). (G, H) LRP heatmaps for RETFound.
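
The k-NN AUROC values quoted in the Figure 2 caption measure how well class labels can be recovered from the 2D embedding alone. The following is a minimal sketch of one way such a score could be computed, not the authors' evaluation code. It assumes a binary label, a leave-one-out protocol, Euclidean distance, and k = 15, none of which are stated in the text above, and it uses synthetic points in place of real embeddings. The function names `knn_scores` and `auroc` are illustrative.

```python
import math
import random


def knn_scores(points, labels, k):
    """Leave-one-out k-NN score: fraction of positives among the k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        # Distances from point i to every other point (the point itself is excluded).
        dists = sorted(
            (math.dist(p, q), labels[j]) for j, q in enumerate(points) if j != i
        )
        scores.append(sum(lab for _, lab in dists[:k]) / k)
    return scores


def auroc(scores, labels):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in for 2D embeddings (assumption): two overlapping Gaussian blobs.
rng = random.Random(0)
points = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(100)]
points += [(rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)) for _ in range(100)]
labels = [0] * 100 + [1] * 100

knn_auroc = auroc(knn_scores(points, labels, k=15), labels)
print(f"k-NN AUROC on synthetic 2D embeddings: {knn_auroc:.2f}")
```

A multi-class dataset such as a graded disease scale would need an extension of this sketch, for example a one-vs-rest average; which variant underlies the reported numbers is not specified here.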
