Managing Diabetic Retinopathy with Deep Learning: A Data Centric Overview

Shramana Dey, Zahir Khan, T. A. PramodKumar, B. Uma Shankar, Ashis K. Dhara, Ramachandran Rajalakshmi, Rajiv Raman, Sushmita Mitra

Abstract

Diabetic Retinopathy (DR) is a serious microvascular complication of diabetes and one of the leading causes of vision loss worldwide. Although automated detection and grading with Deep Learning (DL) can reduce the burden on ophthalmologists, it is constrained by the limited availability of high-quality datasets. Existing repositories are often geographically narrow, contain few samples, and exhibit inconsistent annotations or variable image quality, which restricts their clinical reliability. This paper presents a comprehensive review and comparative analysis of fundus image datasets used in the management of DR. The study evaluates their usability across key tasks, including binary classification, severity grading, lesion localization, and multi-disease screening. It also categorizes the datasets by size, accessibility, and annotation type (such as image-level, lesion-level, and multi-disease). Finally, a recently published dataset is presented as a case study to illustrate broader challenges in dataset curation and usage. The review consolidates current knowledge while highlighting persistent gaps, such as the lack of standardized lesion-level annotations and longitudinal data. It also outlines recommendations for future dataset development to support clinically reliable and explainable solutions in DR screening.

Paper Structure

This paper contains 14 sections, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Sample fundus images showing the lesions that characterize DR: (a) microaneurysms (MA), hemorrhages (HE), hard exudates (HX), soft exudates (SX), and intraretinal microvascular abnormalities (IRMA), among others, and (b) neovascularization (NV).
  • Figure 2: Representative fundus images illustrating the severity grades of DR: (a) healthy case, and (b)–(e) DR-affected cases.
  • Figure 3: Key challenges in developing high-quality, robust and standardized datasets for DR screening.
  • Figure 4: Chronological overview of major DR fundus image datasets (from 2003 to 2025), showing year of release, dataset type (image-level vs. lesion annotation), and availability.
  • Figure 5: Data distribution in SaNMoD, in terms of (a) the DR severity classes (grades), and (b) samples in the different categories.
  • ...and 1 more figure