Exploring Deep Learning and Ultra-Widefield Imaging for Diabetic Retinopathy and Macular Edema

Pablo Jimenez-Lizcano, Sergio Romero-Tapiador, Ruben Tolosana, Aythami Morales, Guillermo González de Rivera, Ruben Vera-Rodriguez, Julian Fierrez

TL;DR

This study explores state-of-the-art deep learning methods and UWF imaging on three clinically relevant tasks, underscoring the competitiveness of emerging ViTs and foundation models and the promise of feature-level fusion and frequency-domain representations for UWF analysis.

Abstract

Diabetic retinopathy (DR) and diabetic macular edema (DME) are leading causes of preventable blindness among working-age adults. Traditional approaches in the literature rely on standard color fundus photography (CFP) to detect these conditions. However, ultra-widefield (UWF) imaging offers a significantly wider field of view than CFP. Motivated by this, the present study explores state-of-the-art deep learning (DL) methods and UWF imaging on three clinically relevant tasks: i) image quality assessment for UWF, ii) identification of referable diabetic retinopathy (RDR), and iii) identification of DME. Using the publicly available UWF4DR Challenge dataset, released as part of the MICCAI 2024 conference, we benchmark DL models in the spatial (RGB) and frequency domains, including popular convolutional neural networks (CNNs) as well as recent vision transformers (ViTs) and foundation models. We also explore a final feature-level fusion to increase robustness, and we analyze the decisions of the DL models with Grad-CAM to improve explainability. Our proposal achieves consistently strong performance across all architectures, underscoring the competitiveness of emerging ViTs and foundation models and the promise of feature-level fusion and frequency-domain representations for UWF analysis.
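
As a rough illustration of the feature-level fusion mentioned in the abstract, the NumPy sketch below concatenates two feature vectors and passes the result through a small MLP. It is not the authors' implementation: the names `fuse_features` and `TinyMLP`, the 512-dimensional embeddings, the hidden size, and the untrained random weights are all assumptions chosen for illustration. In practice the embeddings would come from trained backbones and the MLP would be trained on the fused vectors.

```python
import numpy as np


def fuse_features(embeddings):
    """Concatenate per-branch feature matrices along the feature axis.

    Each entry has shape (batch, dim_i); the result has shape
    (batch, sum of dim_i).
    """
    return np.concatenate(embeddings, axis=1)


class TinyMLP:
    """One-hidden-layer MLP with ReLU and softmax (hypothetical, untrained)."""

    def __init__(self, in_dim, hidden_dim, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights; a real classifier would learn these.
        self.w1 = rng.normal(0.0, 0.01, (in_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0.0, 0.01, (hidden_dim, n_classes))
        self.b2 = np.zeros(n_classes)

    def forward(self, x):
        hidden = np.maximum(x @ self.w1 + self.b1, 0.0)  # ReLU
        logits = hidden @ self.w2 + self.b2
        # Subtract the row maximum for a numerically stable softmax.
        shifted = logits - logits.max(axis=1, keepdims=True)
        exp = np.exp(shifted)
        return exp / exp.sum(axis=1, keepdims=True)


# Random vectors stand in for backbone outputs (assumed 512-dim each).
rng = np.random.default_rng(42)
rgb_features = rng.normal(size=(4, 512))   # hypothetical spatial-domain branch
freq_features = rng.normal(size=(4, 512))  # hypothetical frequency-domain branch

fused = fuse_features([rgb_features, freq_features])
classifier = TinyMLP(in_dim=fused.shape[1], hidden_dim=64, n_classes=2)
probabilities = classifier.forward(fused)
print(fused.shape, probabilities.shape)
```

Concatenation keeps each branch's features intact and leaves the MLP to learn how to weight them, which is one common way to combine complementary representations.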

Paper Structure

This paper contains 10 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Graphical representation of the proposed framework, summarizing the main components and processing stages of the approach. Both RGB and frequency-domain images are considered in the analysis. Four state-of-the-art deep learning models are studied (MobileNetV2, ResNet18, ViT-B/16, and RETFound). Feature vectors are then concatenated to form a unified multimodal embedding, which is finally fed into a multilayer perceptron (MLP).
  • Figure 2: UWF4DR tasks. (A--B) Quality Assessment: (A) Gradable image with good focus, contrast, and clear structures; (B) Ungradable due to blur, media opacities, and partial eyelid obstruction. (C--D) RDR Identification: (C) Non-referable case; (D) RDR case exhibiting intraretinal hemorrhages, hard exudates, and microaneurysms. (E--F) DME Identification: (E) No DME (normal macula); (F) DME present, showing hard exudates and macular blurring due to fluid accumulation.
  • Figure 3: DFT Magnitude (clipped 99%) comparison: (A) Gradable image showing balanced frequencies; (B) Ungradable image showing concentration in low frequencies due to blur.
  • Figure 4: Grad-CAM examples. Task 1 (Quality Assessment): (A--B) Correct gradable predictions focus on optic disc/vessels; (D--E) Correct ungradable predictions highlight peripheral opacities/eyelids; (C, F) Misclassifications. Task 2 (RDR Identification): (G--H) Attention on characteristic lesions (hemorrhages, exudates). Task 3 (DME Identification): (I--J) Focus concentrated on the macular region; (K) Misclassification. Each panel displays the UWF image and its Grad-CAM heatmap.
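
The frequency-domain view described in Figure 3 can be approximated with a few lines of NumPy. The sketch below is an illustration under stated assumptions, not the paper's preprocessing: reading "clipped 99%" as clipping at the 99th percentile, the log scaling, the normalization to [0, 1], the helper names, and the `radius_fraction` cutoff are all hypothetical choices. Only the use of a DFT magnitude and the observation that blur concentrates energy in low frequencies come from the caption.

```python
import numpy as np


def dft_magnitude(gray, clip_percentile=99.0):
    """Centered log-magnitude DFT of a 2-D image, clipped and scaled to [0, 1]."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    magnitude = np.log1p(np.abs(spectrum))  # log scaling is an assumption
    upper = np.percentile(magnitude, clip_percentile)
    clipped = np.minimum(magnitude, upper)
    return clipped / upper if upper > 0 else clipped


def high_frequency_ratio(gray, radius_fraction=0.25):
    """Share of spectral power lying outside a low-frequency disc.

    The disc is centered on the zero-frequency bin; its radius is
    radius_fraction times half of the shorter image side.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    power = np.abs(spectrum) ** 2
    height, width = gray.shape
    rows, cols = np.ogrid[:height, :width]
    # After fftshift the zero-frequency bin sits at (height // 2, width // 2).
    distance = np.hypot(rows - height // 2, cols - width // 2)
    cutoff = radius_fraction * min(height, width) / 2.0
    return power[distance > cutoff].sum() / power.sum()


def box_blur(gray, half_width=2):
    """Circular box blur built from shifted copies (a stand-in for defocus)."""
    offsets = range(-half_width, half_width + 1)
    total = np.zeros_like(gray, dtype=np.float64)
    for dy in offsets:
        for dx in offsets:
            total += np.roll(np.roll(gray, dy, axis=0), dx, axis=1)
    return total / (len(offsets) ** 2)


# Synthetic data only: random texture as a "sharp" image, then a blurred copy.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
blurred = box_blur(sharp)

print("sharp high-frequency ratio:  ", high_frequency_ratio(sharp))
print("blurred high-frequency ratio:", high_frequency_ratio(blurred))
```

On this synthetic pair the blurred image should report the lower ratio, mirroring the low-frequency concentration the caption attributes to ungradable images. Real UWF images would first need conversion to a single channel.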