SynFundus-1M: A High-quality Million-scale Synthetic fundus images Dataset with Fifteen Types of Annotation

Fangxin Shang, Jie Fu, Yehui Yang, Haifeng Huang, Junwei Liu, Lei Ma

TL;DR

This paper releases SynFundus-1M, a high-quality synthetic dataset of over one million fundus images covering eleven disease types, and demonstrates that retinal disease diagnosis models built on either convolutional neural network (CNN) or Vision Transformer (ViT) architectures benefit from SynFundus-1M.

Abstract

Large-scale public datasets with high-quality annotations are rarely available for intelligent medical imaging research, owing to data privacy concerns and the cost of annotation. In this paper, we release SynFundus-1M, a high-quality synthetic dataset of over one million fundus images covering eleven disease types. In addition, we assign four readability labels to the key regions of the fundus images. To the best of our knowledge, SynFundus-1M is currently the largest fundus dataset with the most sophisticated annotations. Using over 1.3 million private authentic fundus images from various scenarios, we trained a powerful Denoising Diffusion Probabilistic Model named SynFundus-Generator. The released SynFundus-1M images are generated by SynFundus-Generator under predefined conditions. To demonstrate the value of SynFundus-1M, we design extensive experiments covering two aspects: 1) Authenticity of the images: we randomly blend the synthetic images with authentic fundus images and find that experienced annotators can hardly distinguish the synthetic images from the authentic ones. Moreover, we show that disease-related visual features (e.g., lesions) are well simulated in the synthetic images. 2) Effectiveness for downstream fine-tuning and pretraining: we demonstrate that retinal disease diagnosis models built on either convolutional neural network (CNN) or Vision Transformer (ViT) architectures benefit from SynFundus-1M. Compared with the datasets commonly used for pretraining, models trained on SynFundus-1M not only achieve superior performance but also converge faster on various downstream tasks. SynFundus-1M is already publicly available to the open-source community.
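
As a rough illustration of what generating images "under predefined conditions" with a Denoising Diffusion Probabilistic Model involves, the sketch below runs the standard DDPM ancestral sampling loop with a condition passed to the noise estimator at every step. It is not the authors' implementation: the linear noise schedule, the `toy_noise_estimator` stand-in for a trained network, the scalar `condition`, and all function names are assumptions made for the example, and it operates on a short vector rather than an image.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def toy_noise_estimator(x_t, t, condition):
    """Hypothetical stand-in for a trained network eps_theta(x_t, t, c).

    A real estimator would be a neural network that receives an embedding
    of the condition (e.g. disease and readability labels).
    """
    return [xi - condition for xi in x_t]


def sample(estimator, condition, dim, num_steps=50, seed=0):
    """Standard DDPM ancestral sampling, conditioned through the estimator."""
    rng = random.Random(seed)
    betas = linear_betas(num_steps)
    alphas = [1.0 - b for b in betas]

    # Cumulative products alpha_bar_t = prod_{s <= t} alpha_s.
    alpha_bars = []
    running = 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    # Start from pure Gaussian noise.
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    for t in reversed(range(num_steps)):
        eps = estimator(x, t, condition)  # the condition enters here
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t])
                for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # sigma_t^2 = beta_t variant
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    return x


sample_a = sample(toy_noise_estimator, condition=0.0, dim=4)
sample_b = sample(toy_noise_estimator, condition=1.0, dim=4)
```

With the same seed, `sample_a` and `sample_b` share their random draws and differ only through the condition, which is the sense in which the condition steers generation in this toy setup.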

Paper Structure (13 sections, 6 figures, 7 tables)

This paper contains 13 sections, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Illustration of the SynFundus-Generator. The generation conditions are embedded into the noise estimation procedure. The orange arrow indicates that the noise estimator loops iteratively through noise estimation and image generation.
  • Figure 2: Illustration of the range of readability within the SynFundus-1M dataset. The first column shows readable images, and the remaining columns show various non-readable samples. To aid explanation and comparison, the optic disc is outlined with a red circle and the macular region is marked with a yellow circle.
  • Figure 3: Confusion matrix displaying the ability of four annotators to discern the authenticity of fundus images. Category Syn indicates synthetic images, while category Real denotes authentic images. Metrics nearing 0.5 reflect the annotators' challenges in distinguishing between images, underscoring the authenticity of SynFundus-1M.
  • Figure 4: Distribution of F1-Scores by disease category for the four annotators' evaluations, with the gray dotted line representing the baseline F1-Score of 0.5 for random guessing. Scores that deviate significantly from 0.5 suggest that the image's authenticity is easier to identify.
  • Figure 5: Qualitative image generation comparisons. Authentic images (odd columns) are paired with SynFundus images (even columns) under various disease conditions. The disease-related visual features are marked with red arrows. The fundus structure and lesions are very similar between authentic and synthetic images.
  • ...and 1 more figures
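
The captions for Figures 3 and 4 rely on a confusion matrix and an F1-Score for the annotators' real-versus-synthetic judgments. The following minimal sketch shows how such numbers can be computed; the label strings `"Syn"` and `"Real"` follow the caption, while the function names and the four example judgments are hypothetical and not data from the paper. It also illustrates why 0.5 is the chance-level reference when the two classes are balanced and the annotator is right half the time in each class.

```python
def confusion_counts(true_labels, guessed_labels, positive="Syn"):
    """Count TP, FP, FN, TN, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(true_labels, guessed_labels):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when undefined."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def row_normalized_matrix(tp, fp, fn, tn):
    """Rows are the true class (Syn, Real); columns are the guess (Syn, Real)."""
    syn_total = tp + fn
    real_total = fp + tn
    return [
        [tp / syn_total, fn / syn_total],
        [fp / real_total, tn / real_total],
    ]


# Hypothetical judgments from one annotator on a balanced set of four images.
truth = ["Syn", "Syn", "Real", "Real"]
guess = ["Syn", "Real", "Syn", "Real"]

tp, fp, fn, tn = confusion_counts(truth, guess)
f1 = f1_score(tp, fp, fn)
matrix = row_normalized_matrix(tp, fp, fn, tn)
```

In this made-up case the annotator is right on half of each class, so every cell of the row-normalized matrix and the F1-Score equal 0.5.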