ETSCL: An Evidence Theory-Based Supervised Contrastive Learning Framework for Multi-modal Glaucoma Grading

Zhiyuan Yang; Bo Zhang; Yufei Shi; Ningze Zhong; Johnathan Loh; Huihui Fang; Yanwu Xu; Si Yong Yeo

TL;DR

This work addresses multi-modal glaucoma grading from color fundus photography (CFP) and optical coherence tomography (OCT) by combining supervised contrastive feature learning, a Frangi vesselness auxiliary branch, and uncertainty-aware decision-level fusion based on evidence theory. The two-stage ETSCL framework first learns discriminative modality-specific embeddings with a supervised contrastive loss and Frangi vessel preprocessing, then fuses the branch outputs through Dirichlet-based priors and Dempster's rule to produce predictions with uncertainty estimates. Key contributions are (i) a triple-branch feature extractor that includes a vessel branch, (ii) a fusion mechanism driven by Dirichlet distributions and evidence theory, with uncertainty estimation, and (iii) state-of-the-art results on the GAMMA dataset, with kappa = 0.8844 and accuracy = 0.84. The approach shows performance gains over baselines and offers a principled way to quantify modality-level uncertainty, though it is limited by dataset size and by the absence of private data for broader generalization testing.
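To make the fusion step more concrete, the sketch below shows one common way to combine two Dirichlet-based opinions with a reduced form of Dempster's rule, as often used in evidential multi-view classification. It is an illustration under assumptions, not the authors' implementation: the function names `opinion_from_evidence` and `dempster_combine`, the three-class setup, and the evidence values are all hypothetical, and the exact formulation in ETSCL may differ.

```python
def opinion_from_evidence(evidence):
    """Turn non-negative per-class evidence into (belief masses, uncertainty).

    Assumed convention: Dirichlet parameters alpha_k = e_k + 1,
    belief b_k = e_k / S and uncertainty u = K / S, with S = sum(alpha).
    """
    num_classes = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    belief = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return belief, uncertainty


def dempster_combine(opinion_a, opinion_b):
    """Fuse two opinions with the reduced Dempster combination rule."""
    b1, u1 = opinion_a
    b2, u2 = opinion_b
    # Conflict: belief mass the two sources assign to different classes.
    conflict = sum(
        b1[i] * b2[j]
        for i in range(len(b1))
        for j in range(len(b2))
        if i != j
    )
    scale = 1.0 / (1.0 - conflict)
    belief = [
        scale * (b1[k] * b2[k] + b1[k] * u2 + b2[k] * u1)
        for k in range(len(b1))
    ]
    uncertainty = scale * u1 * u2
    return belief, uncertainty


# Hypothetical evidence for three grades from two branches.
cfp_opinion = opinion_from_evidence([9.0, 1.0, 0.0])  # confident branch
oct_opinion = opinion_from_evidence([1.0, 1.0, 1.0])  # uninformative branch
fused_belief, fused_uncertainty = dempster_combine(cfp_opinion, oct_opinion)
predicted_grade = max(range(len(fused_belief)), key=fused_belief.__getitem__)
```

In this example the uninformative branch carries high uncertainty, so the fused opinion follows the confident branch, and the fused belief masses plus the fused uncertainty still sum to one. A third branch (for instance a vessel branch) could be folded in by calling `dempster_combine` again on the fused result.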

Abstract

Glaucoma is one of the leading causes of vision impairment. Digital imaging techniques, such as color fundus photography (CFP) and optical coherence tomography (OCT), provide quantitative and noninvasive methods for glaucoma diagnosis. Recently, in the field of computer-aided glaucoma diagnosis, multi-modality methods that integrate the CFP and OCT modalities have achieved greater diagnostic accuracy compared to single-modality methods. However, it remains challenging to extract reliable features due to the high similarity of medical images and the unbalanced multi-modal data distribution. Moreover, existing methods overlook the uncertainty estimation of different modalities, leading to unreliable predictions. To address these challenges, we propose a novel framework, namely ETSCL, which consists of a contrastive feature extraction stage and a decision-level fusion stage. Specifically, the supervised contrastive loss is employed to enhance the discriminative power in the feature extraction process, resulting in more effective features. In addition, we utilize the Frangi vesselness algorithm as a preprocessing step to incorporate vessel information to assist in the prediction. In the decision-level fusion stage, an evidence theory-based multi-modality classifier is employed to combine multi-source information with uncertainty estimation. Extensive experiments demonstrate that our method achieves state-of-the-art performance. The code is available at https://github.com/master-Shix/ETSCL.
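As a rough illustration of the supervised contrastive loss mentioned above, the following sketch computes the standard formulation over a small batch: each anchor is pulled toward samples with the same label and pushed away from all others. It is a minimal pure-Python sketch under assumptions, not the authors' code. The function names, the temperature of 0.1, and the toy embeddings are hypothetical, and embeddings are assumed to be non-zero vectors.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have positives."""
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, num_anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without a same-label partner contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_sim = max(sims.values())
        log_denominator = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[p] - log_denominator for p in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0


# Toy batch: two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supcon_loss(toy_embeddings, [0, 0, 1, 1])    # labels follow clusters
loss_mismatched = supcon_loss(toy_embeddings, [0, 1, 0, 1])  # labels cut across clusters
```

When labels agree with the cluster structure the loss is close to zero, and when they cut across clusters it is large, which is the pressure that encourages same-grade images to share an embedding region.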
