Cross Feature Fusion of Fundus Image and Generated Lesion Map for Referable Diabetic Retinopathy Classification

Dahyun Mok, Junghyun Bum, Le Duc Tai, Hyunseung Choo

TL;DR

This paper develops an advanced cross-learning DR classification method leveraging transfer learning and cross-attention mechanisms, and employs the Swin U-Net architecture to segment lesion maps from DR fundus images.

Abstract

Diabetic Retinopathy (DR) is a leading cause of blindness, which makes early detection and diagnosis essential. This paper focuses on referable DR classification to make the proposed method more applicable in clinical practice. We develop a cross-learning DR classification method that leverages transfer learning and cross-attention mechanisms. The method first employs the Swin U-Net architecture to segment lesion maps from DR fundus images. The Swin U-Net segmentation model, having learned DR lesion features, is then transferred to generate a lesion map for each fundus image. Both the fundus image and its generated lesion map are used as complementary inputs to the classification model. A cross-attention mechanism improves the model's ability to capture fine-grained details from these input pairs. Our experiments on two public datasets, FGADR and EyePACS, show an accuracy of 94.6%, surpassing current state-of-the-art methods by 4.4%. We aim for the proposed method to be integrated into clinical workflows, improving accuracy and efficiency in identifying referable DR.

Paper Structure

This paper contains 11 sections, 3 equations, 7 figures, and 3 tables.

Figures (7)

  • Figure 1: Examples of DR signs: (a) microaneurysms, hemorrhages, and exudates, illustrating how each lesion appears in DR images; (b) examples of lesion maps.
  • Figure 2: Overview of the proposed method, which consists of two steps.
  • Figure 3: Architecture of the proposed method: (a) the lesion map is generated using the transfer learning capabilities of Swin U-Net; (b) a Cross Attention Block, in which the original image X and the generated lesion map Y are concatenated and transformed into Query Q, Key K, and Value V matrices to compute the final attended representation for X; (c) the original image and the generated lesion map are input into Cross Swin-T to classify DR.
  • Figure 4: Examples of retinal images from (a) the EyePACS dataset, showing fundus images of varying quality and characteristics captured with various fundus cameras, such as the Canon CR-2 and Topcon NW400, and (b) the FGADR dataset, where the first column shows the original images and the second column shows the lesion maps.
  • Figure 5: Examples of applying preprocessing and data augmentation to the EyePACS dataset. The first column shows the original images, and the subsequent columns show, in order, the results of applying each preprocessing step to the original image.
  • ...and 2 more figures
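
The Cross Attention Block described in the Figure 3(b) caption can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-head scaled dot-product attention, with queries projected from the image tokens X and keys and values projected from the concatenation of X and the lesion-map tokens Y, which is one plausible reading of the caption. The function names, the projection matrices w_q, w_k, w_v, and the token counts are hypothetical, and plain Python lists are used only to keep the example self-contained.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(x_tokens, y_tokens, w_q, w_k, w_v):
    """Attend from image tokens X to the joint sequence [X; Y].

    Assumption (not confirmed by the source): Q comes from X alone, while
    K and V come from X and Y concatenated along the token axis.
    """
    context = x_tokens + y_tokens             # concatenate token sequences
    q = matmul(x_tokens, w_q)                 # (n_x, d)
    k = matmul(context, w_k)                  # (n_x + n_y, d)
    v = matmul(context, w_v)                  # (n_x + n_y, d)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]      # transpose of K
    scores = matmul(q, k_t)                   # (n_x, n_x + n_y)
    weights = [softmax([s / scale for s in row]) for row in scores]
    attended = matmul(weights, v)             # attended representation for X
    return attended, weights


def random_matrix(rows, cols):
    """Hypothetical stand-in for learned weights or patch embeddings."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
dim = 4
x = random_matrix(3, dim)   # 3 image tokens (illustrative size)
y = random_matrix(5, dim)   # 5 lesion-map tokens (illustrative size)
w_q, w_k, w_v = (random_matrix(dim, dim) for _ in range(3))
attended, weights = cross_attention(x, y, w_q, w_k, w_v)
print(len(attended), len(attended[0]), len(weights[0]))
```

The output keeps one row per image token, so the attended representation can replace X in a later stage, while each attention row spans both image and lesion-map tokens. A real model would use learned projections, multiple heads, and the windowed attention of a Swin backbone, none of which are shown here.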