DiffuPT: Class Imbalance Mitigation for Glaucoma Detection via Diffusion Based Generation and Model Pretraining

Youssof Nawar; Nouran Soliman; Moustafa Wassel; Mohamed ElHabebe; Noha Adly; Marwan Torki; Ahmed Elmassry; Islam Ahmed

TL;DR

DiffuPT addresses class imbalance in glaucoma detection by synthesizing data with diffusion models and pretraining classifiers on a balanced synthetic dataset before fine-tuning on real data. It introduces GlaucomaEgy, the largest national dataset for glaucoma detection, and demonstrates that diffusion-based augmentation outperforms GAN-based augmentation and other imbalance mitigation strategies. The harmonic mean of sensitivity and specificity improves from $89.09\%$ to $92.59\%$ on GlaucomaEgy, with consistent gains on AIROGS, indicating improved robustness and generalization. DiffuPT reduces variance in latent embeddings and provides a practical, annotation-free approach to mitigating class imbalance in medical imaging.

Abstract

Glaucoma is a progressive optic neuropathy characterized by structural damage to the optic nerve head and functional changes in the visual field. Detecting glaucoma early is crucial to preventing loss of eyesight. However, medical datasets often suffer from class imbalance, which makes detection more difficult for deep-learning algorithms: the imbalance between normal and glaucomatous cases degrades classifier performance. We use a generative framework to enhance glaucoma diagnosis, specifically addressing class imbalance through synthetic data generation. In addition, we collected the largest national dataset for glaucoma detection to support our study. By combining our proposed diffusion-based framework with a pretraining approach, we obtain a more robust training process that yields a better-performing classifier. The proposed approach shows promising results in improving the harmonic mean of sensitivity and specificity and the area under the ROC curve (AUC) of the glaucoma classifier. We report an improvement in the harmonic mean from 89.09% to 92.59% on the test set of our national dataset, and similar improvements on the AIROGS dataset. Through extensive experiments, we compare our method against other approaches to overcoming class imbalance. This study highlights that diffusion-based generation can be of great importance in tackling class imbalance in medical datasets to improve diagnostic performance.
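The headline metric above is the harmonic mean of sensitivity and specificity. The following is a minimal sketch of how such a metric could be computed from binary labels; the function names and the toy labels are illustrative assumptions, not the authors' evaluation code, and the convention that label 1 means glaucoma is also assumed.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumption: label 1 = glaucoma (positive), label 0 = normal (negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


def harmonic_mean(sensitivity, specificity):
    """Harmonic mean of two rates; 0.0 if both are zero."""
    total = sensitivity + specificity
    return 2 * sensitivity * specificity / total if total else 0.0


# Toy, made-up labels (not data from the paper): 4 glaucoma, 6 normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
hm = harmonic_mean(sens, spec)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} harmonic_mean={hm:.4f}")
```

Because the harmonic mean is dominated by the smaller of the two rates, a classifier that favors the majority (normal) class scores poorly on it even when its overall accuracy is high, which is why it suits an imbalanced setting.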
