MAPLES-DR: MESSIDOR Anatomical and Pathological Labels for Explainable Screening of Diabetic Retinopathy
Gabriel Lepetit-Aimon, Clément Playout, Marie Carole Boucher, Renaud Duval, Michael H Brent, Farida Cheriet
TL;DR
Diabetic retinopathy (DR) screening benefits from explainable AI, but existing datasets lack the dense pixel-level biomarker annotations needed to explain model decisions. MAPLES-DR adds pixel-wise segmentations of 10 anatomical and pathological retinal biomarkers to 198 images from MESSIDOR, alongside DR and macular edema (ME) grades assigned under Canadian teleophthalmology guidelines. The paper details the annotation workflow, interobserver variability, and data stratification, and provides train/test splits, semantic maps (multilabel and multiclass), supplementary materials, and open-source tools to reproduce and extend the annotations. The authors also report baseline segmentation performance and show that pretraining on MAPLES-DR benefits transfer to other datasets such as IDRiD. The dataset and tools align labels with ophthalmologic vocabulary to improve clinical interpretability and support explainable AI for DR screening.
Abstract
Reliable automatic diagnosis of Diabetic Retinopathy (DR) and Macular Edema (ME) is an invaluable asset in improving the rate of monitored patients among at-risk populations and in enabling earlier treatments before the pathology progresses and threatens vision. However, the explainability of screening models is still an open question, and specifically designed datasets are required to support the research. We present MAPLES-DR (MESSIDOR Anatomical and Pathological Labels for Explainable Screening of Diabetic Retinopathy), which contains, for 198 images of the MESSIDOR public fundus dataset, new diagnoses for DR and ME as well as new pixel-wise segmentation maps for 10 anatomical and pathological biomarkers related to DR. This paper documents the design choices and the annotation procedure that produced MAPLES-DR, discusses the interobserver variability and the overall quality of the annotations, and provides guidelines on using the dataset in a machine learning context.
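As an illustration of how pixel-wise maps for several biomarkers might be handled in a machine learning pipeline, the following minimal sketch collapses a stack of per-biomarker binary masks (a multilabel representation) into a single class map (a multiclass representation). The biomarker names, the overlap rule, and the function name are assumptions made for this example; they are not taken from the MAPLES-DR tools.

```python
import numpy as np

# Hypothetical biomarker names and ordering, for illustration only;
# they are not the official MAPLES-DR label names or priorities.
BIOMARKERS = ["vessels", "hemorrhages", "exudates"]


def multilabel_to_multiclass(masks: np.ndarray) -> np.ndarray:
    """Collapse a (K, H, W) stack of binary masks into an (H, W) class map.

    Class 0 is background and class k + 1 corresponds to masks[k].
    Where several masks overlap, the mask with the highest index wins.
    """
    masks = np.asarray(masks, dtype=bool)
    class_map = np.zeros(masks.shape[1:], dtype=np.uint8)
    for k, mask in enumerate(masks):
        class_map[mask] = k + 1  # later masks overwrite earlier ones
    return class_map


# Toy 2x3 example with one overlapping pixel.
masks = np.zeros((len(BIOMARKERS), 2, 3), dtype=bool)
masks[0, 0, :] = True   # "vessels" cover the top row
masks[1, 0, 1] = True   # "hemorrhages" overlap one vessel pixel
masks[2, 1, 2] = True   # "exudates" sit on a background pixel
class_map = multilabel_to_multiclass(masks)
```

A multiclass map assigns exactly one label per pixel, so overlapping biomarkers force a choice; the multilabel stack avoids that loss at the cost of one channel per biomarker.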
