Explainable Severity ranking via pairwise n-hidden comparison: a case study of glaucoma

Hong Nguyen; Cuong V. Nguyen; Shrikanth Narayanan; Benjamin Y. Xu; Michael Pazzani

TL;DR

This work reframes glaucoma severity assessment as a pairwise ranking problem over fundus images and introduces a siamese network with an n-hidden comparison mechanism to capture multiple latent severity criteria. By combining a multi-criterion scoring function with a downstream Bradley-Terry ranking and an explainable AI pipeline using SHAP and GradCAM, the approach yields higher pairwise accuracy than traditional single-score baselines and provides interpretable, image-based explanations aligned with ophthalmologist insights such as the CD ratio and the MD index. Experiments on the OHTS fundus image dataset demonstrate that a 10-hidden configuration achieves about a 12% improvement in pairwise comparison accuracy, with competitive nDCG and qualitative saliency explanations. The results advance clinically actionable AI by delivering more nuanced severity ranking and human-interpretable justifications, supporting resource prioritization and longitudinal monitoring in glaucoma care.
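To make the Bradley-Terry step concrete, the sketch below fits Bradley-Terry strengths to a list of pairwise outcomes using the standard minorization-maximization (MM) update. It is an illustration only: the summary does not say how the paper fits the model, and the function name `bradley_terry`, the `(more_severe, less_severe)` input format, and the toy data are all assumptions.

```python
def bradley_terry(n_items, comparisons, iters=500, tol=1e-10):
    """Estimate Bradley-Terry strengths from pairwise outcomes.

    comparisons: list of (winner, loser) index pairs; here "winner" is the
    image judged more severe (hypothetical input format, not from the paper).
    Returns a list of strengths that sums to 1.
    """
    wins = [0] * n_items
    pair_counts = {}  # unordered pair -> number of comparisons
    for winner, loser in comparisons:
        wins[winner] += 1
        key = (min(winner, loser), max(winner, loser))
        pair_counts[key] = pair_counts.get(key, 0) + 1

    p = [1.0 / n_items] * n_items
    for _ in range(iters):
        # MM update: p_i <- wins_i / sum_j n_ij / (p_i + p_j)
        denom = [0.0] * n_items
        for (i, j), n_ij in pair_counts.items():
            d = n_ij / (p[i] + p[j])
            denom[i] += d
            denom[j] += d
        new_p = [wins[i] / denom[i] if denom[i] > 0 else p[i]
                 for i in range(n_items)]
        total = sum(new_p)
        new_p = [x / total for x in new_p]
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


# Toy data: three images, each pair compared three times.
comparisons = [
    (0, 1), (0, 1), (1, 0),
    (1, 2), (1, 2), (2, 1),
    (0, 2), (0, 2), (2, 0),
]
scores = bradley_terry(3, comparisons)
# Sort image indices from most to least severe by estimated strength.
ranking = sorted(range(3), key=lambda i: -scores[i])
```

In this toy setup image 0 wins most of its comparisons and image 2 the fewest, so the fitted strengths order the images 0, 1, 2. In a pipeline like the one summarized above, the pairwise outcomes would presumably come from the comparison model's predictions rather than a hand-written list.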

Abstract

Primary open-angle glaucoma (POAG) is a chronic and progressive optic nerve condition that results in an acquired loss of optic nerve fibers and potential blindness. Because glaucoma develops gradually, patients can lose vision progressively without being consciously aware of the changes. To diagnose POAG and determine its severity, patients must undergo a comprehensive dilated eye examination. In this work, we build a framework to rank, compare, and interpret the severity of glaucoma using fundus images. We introduce a siamese-based severity ranking using pairwise n-hidden comparisons. We additionally propose a novel approach to explaining why a specific image is deemed more severe than others. Our findings indicate that the proposed severity ranking model surpasses traditional ones in terms of diagnostic accuracy and delivers improved saliency explanations.

Paper Structure (10 sections, 2 equations, 3 figures, 1 table)

Introduction
Related Works
Problem Statement
Methodology
n-Hidden Comparison
Explainable Framework
Experimental Setup & Results
Dataset
Results
Conclusions and Future Works

Figures (3)

Figure 1: Paradigm Shift with AI-Enhanced Eye Care. While different ophthalmologists may use different "cut-offs" for severity levels, they most likely agree when comparing pairs of images. We propose a framework for severity ranking via pairwise comparisons.
Figure 2: The three phases of our proposed framework for severity ranking via pairwise comparisons.
Figure 3: 10-hidden comparison in the best IoU case. Expert annotation shows that the upper image is more severe than the lower one because of the expansion of the optic nerve to the upper-left and lower-left regions of the optic disc, as presented by the segmentation masks.
