Detecting Severity of Diabetic Retinopathy from Fundus Images: A Transformer Network-based Review

Tejas Karkera; Chandranath Adak; Soumi Chattopadhyay; Muhammad Saqib

TL;DR

This work tackles automatic DR severity grading from fundus images using an ensemble of four image transformers (ViT, DeiT, CaiT, BEiT). Using targeted preprocessing, fine-tuning, and two fusion strategies (weighted mean and majority voting), the authors show that the weighted-mean ensemble EiT_wm achieves the highest accuracy, 94.63%, with strong agreement with human raters (kappa ≈ 0.92) on the APTOS-2019 dataset, outperforming several CNN baselines and prior transformer approaches. Ablation studies and cross-dataset pretraining further demonstrate the robustness and potential of transformer ensembles for ophthalmic image analysis. The findings suggest that transformer-based ensembles can capture salient retinal features for DR severity grading while offering interpretable localization via Grad-CAM. Future work will focus on addressing class imbalance and incorporating lesion segmentation to further improve performance and explainability.
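The two fusion strategies named above can be illustrated with a minimal sketch. It assumes each model outputs per-class probabilities over the five APTOS-2019 severity grades; the function names, array layout, and weights are hypothetical and are not taken from the paper, whose actual weighting scheme is not given here.

```python
import numpy as np

def weighted_mean_fusion(probs, weights):
    """Fuse by a weighted average of class probabilities.

    probs:   array of shape (n_models, n_samples, n_classes)
    weights: one non-negative weight per model (hypothetical values;
             normalized here so they sum to 1)
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Contract the model axis: result has shape (n_samples, n_classes).
    fused = np.tensordot(weights, probs, axes=1)
    return fused.argmax(axis=1)

def majority_vote_fusion(probs):
    """Fuse by counting each model's top-1 prediction.

    Ties are resolved in favor of the lowest class index, which is an
    assumption of this sketch, not a rule stated in the source.
    """
    probs = np.asarray(probs, dtype=float)
    n_classes = probs.shape[2]
    votes = probs.argmax(axis=2)                 # (n_models, n_samples)
    counts = np.eye(n_classes)[votes].sum(axis=0)  # (n_samples, n_classes)
    return counts.argmax(axis=1)

if __name__ == "__main__":
    # Four models (standing in for ViT, DeiT, CaiT, BEiT), one image, five grades.
    hesitant = [0.05, 0.40, 0.35, 0.10, 0.10]
    confident = [0.0, 0.0, 1.0, 0.0, 0.0]
    probs = np.array([[hesitant], [hesitant], [hesitant], [confident]])
    print("weighted mean:", weighted_mean_fusion(probs, [1, 1, 1, 1]))
    print("majority vote:", majority_vote_fusion(probs))
```

The toy input shows how the two strategies can disagree: three hesitant models outvote one confident model under majority voting, while the weighted mean lets the confident model's probability mass decide the grade.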

Abstract

Diabetic Retinopathy (DR) is a significant concern worldwide, primarily because it causes vision loss among many people with diabetes. Ophthalmologists typically assess the severity of DR manually from fundus photographs of the retina. This paper addresses automated grading of DR severity stages. In the literature, researchers have approached this automation with traditional machine learning algorithms and convolutional architectures. However, previous work has rarely focused on the essential regions of the retinal image to improve model performance. In this study, we adopt and fine-tune transformer-based learning models to capture the crucial features of retinal images for a more nuanced understanding of DR severity. Additionally, we explore the effectiveness of image transformers in inferring the degree of DR severity from fundus photographs. For the experiments, we used the publicly available APTOS-2019 blindness detection dataset, on which the transformer-based models performed encouragingly.
