Efficient Automated Diagnosis of Retinopathy of Prematurity by Customize CNN Models

Farzan Saeedi, Sanaz Keshvari, Nasser Shoeibi

TL;DR

This study tackles automated Retinopathy of Prematurity (ROP) diagnosis under data-scarce conditions by combining data augmentation, a customized CNN architecture, and a voting system across multiple retinal views. Using a MobileNetV2 backbone as the baseline, the authors design a task-specific CNN that uses 1×1 and 2×2 convolutions, Batch Normalization, and a final fully connected layer that extracts about 160 features before binary classification; the model is optimized with binary cross-entropy loss. Experiments show that the pre-trained MobileNet benefits from normalization and augmentation, but the customized CNN with voting substantially outperforms it and can reach near-perfect accuracy after fine-tuning, with time-complexity trade-offs that favor deployment on specialized hardware. The approach targets accurate, efficient ROP screening in clinical practice, and the authors point to stage/zone classification and on-device deployment as directions for extending the method.
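
To make the voting step concrete, here is a minimal sketch of how per-view predictions for one eye could be combined. The summary does not say which voting rule the authors use, so hard majority voting, the 0.5 threshold, the tie-breaking rule, and the name `vote_over_views` are all assumptions made for illustration.

```python
def vote_over_views(view_probs, threshold=0.5):
    """Combine per-view ROP probabilities for one eye by majority vote.

    Hypothetical helper: the voting rule, the threshold, and the
    tie-breaking toward the positive class are assumptions, not
    details taken from the paper.
    """
    if not view_probs:
        raise ValueError("need at least one retinal view")
    # Count the views whose predicted probability crosses the threshold.
    positive_votes = sum(1 for p in view_probs if p >= threshold)
    # Label 1 (ROP) when at least half of the views vote positive.
    return 1 if 2 * positive_votes >= len(view_probs) else 0


# Three views of the same eye, two of which look positive.
label = vote_over_views([0.9, 0.2, 0.7])
```

Breaking ties toward the positive class is a deliberately conservative choice for a screening setting, where a missed case is usually more costly than a false alarm; a soft vote that averages the probabilities would be an equally plausible reading of the text.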

Abstract

This paper examines the diagnosis of Retinopathy of Prematurity (ROP) with deep learning. We focus on refining and evaluating CNN-based approaches for precise and efficient ROP detection. We address dataset curation, preprocessing strategies, and model architecture, in line with three research objectives: model effectiveness, computational cost analysis, and time complexity assessment. The results show that tailored CNN models outperform pre-trained counterparts, with higher accuracy and F1-scores, and that a voting system further improves performance. The proposed customized CNN model also has the potential to reduce the computational burden associated with deep neural networks. Finally, we show that these models can be deployed within dedicated software and hardware configurations, which makes them useful diagnostic aids in clinical settings. In summary, our work contributes to ROP diagnosis by demonstrating that deep learning models can improve diagnostic precision and efficiency.
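
Since the comparison rests on accuracy and F1-score, the following sketch shows how both are computed for a binary task. It uses the standard definitions; the function name `accuracy_and_f1` and the example labels are hypothetical and are not results from the paper.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, with 1 as the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Made-up labels for illustration only.
acc, f1 = accuracy_and_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

F1 is reported alongside accuracy because accuracy alone can look high on an imbalanced dataset even when the positive class is poorly detected.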

Paper Structure

This paper contains 19 sections, 1 equation, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Workflow for automated diagnosis of ROP using customized CNN models: preprocessing, feature extraction, and voting system integration. After the preprocessing steps, data augmentation is applied to alleviate the imbalanced-data problem. The preprocessed images are then fed into the CNN models, which extract the relevant features. The approach incorporates a voting system to improve ROP classification by accounting for the different angles captured in the retinal images.
  • Figure 2: The proposed CNN model architecture designed specifically for retinal image processing. It includes specialized components for effective feature extraction and classification. Convolution layers with varying stride sizes and ReLU activation functions are strategically incorporated to capture both local and global features. Batch Normalization is utilized to enhance model robustness against outliers. The fully connected layer acts as the last feature extractor layer, summarizing essential information from the retinal image. The model's output provides predictions for treatment decisions based on the extracted features.
  • Figure 3: Training and validation curves for the three best models. (a) Accuracy and (b) loss of the best MobileNet; (c) accuracy and (d) loss of Customized CNN before fine-tuning; (e) accuracy and (f) loss of Customized CNN after fine-tuning. Blue lines represent training data; red/orange lines represent validation data.
  • Figure 4: Time complexity analysis of the pre-trained MobileNet and customized CNN models using Frames Per Second (FPS) computation. Part (a) shows FPS for 46 images with runtimes of 10 and 100 for both models. Part (b) presents runtime for random images.
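
The FPS comparison described for Figure 4 can be approximated with a simple timing loop. This is only a sketch: it assumes that "runtimes of 10 and 100" means 10 and 100 repeated passes over the image set, and `measure_fps` and the stand-in `predict` callable are hypothetical names, not the authors' benchmarking code.

```python
import time


def measure_fps(predict, images, runs=10):
    """Estimate frames per second of `predict` over repeated passes.

    Hypothetical helper: `predict` stands in for a model's inference
    call, and `runs` is the assumed number of passes over `images`.
    """
    if not images or runs < 1:
        raise ValueError("need at least one image and one run")
    start = time.perf_counter()
    for _ in range(runs):
        for image in images:
            predict(image)
    elapsed = time.perf_counter() - start
    frames = runs * len(images)
    # Guard against a zero reading on very coarse timers.
    return frames / elapsed if elapsed > 0 else float("inf")


# Dummy workload in place of a real model: 46 "images", 10 passes.
dummy_images = [[0.0] * 64 for _ in range(46)]
fps = measure_fps(lambda img: sum(img), dummy_images, runs=10)
```

With a real model, a few warm-up calls before timing help exclude one-off initialization cost from the measurement.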