RetiGen: A Framework for Generalized Retinal Diagnosis Using Multi-View Fundus Images

Ze Chen, Gongyu Zhang, Jiayu Huo, Joan Nunez do Rio, Charalampos Komninos, Yang Liu, Rachel Sparks, Sebastien Ourselin, Christos Bergeles, Timothy Jackson

TL;DR

RetiGen tackles domain shift in retinal disease diagnosis by exploiting unlabeled multi-view fundus images. It combines three components: PDC, which balances pseudo-labels across classes; TSD, a test-time self-distillation step that refines predictions with a memory-augmented, consistency-based scheme; and MVLCE, which locally clusters and ensembles multi-view embeddings. Together they support both online and offline deployment. The method improves domain generalization when integrated with existing DG approaches, and ablations show that TSD and MVLCE provide complementary benefits, with notable gains on a multi-view retinal dataset. These advances make it more practical to deploy retinal diagnostics across diverse clinics and imaging devices, because they reduce the need for labeled target-domain data. The training objective can be summarized as $L_t = L_t^{ce} + L_t^{div}$, which balances predictive accuracy ($L_t^{ce}$) against diversity regularization ($L_t^{div}$).
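The two-term objective above can be illustrated with a small sketch. The summary does not spell out either term, so the forms below are assumptions: `cross_entropy` is taken against pseudo-labels, and `diversity_loss` is the negative entropy of the batch-mean prediction, a common choice for discouraging collapse onto one class. All function names are hypothetical, and this is not necessarily the authors' exact formulation.

```python
import math

EPS = 1e-12  # guards against log(0)


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs_batch, pseudo_labels):
    """Assumed L^ce: mean negative log-likelihood of the pseudo-labels."""
    nll = [-math.log(p[y] + EPS) for p, y in zip(probs_batch, pseudo_labels)]
    return sum(nll) / len(nll)


def diversity_loss(probs_batch):
    """Assumed L^div: negative entropy of the batch-mean prediction.

    It is lowest when the average prediction is spread evenly over classes.
    """
    n = len(probs_batch)
    k = len(probs_batch[0])
    mean = [sum(p[c] for p in probs_batch) / n for c in range(k)]
    return sum(q * math.log(q + EPS) for q in mean)


def total_loss(logits_batch, pseudo_labels):
    """L_t = L_t^ce + L_t^div for one mini-batch."""
    probs = [softmax(logits) for logits in logits_batch]
    return cross_entropy(probs, pseudo_labels) + diversity_loss(probs)
```

Under these assumptions, a batch whose predictions all fall on one class receives a higher diversity term than a batch whose predictions are spread across classes, which is the regularizing effect the summary describes.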

Abstract

This study introduces a novel framework for enhancing domain generalization in medical imaging, specifically focusing on utilizing unlabelled multi-view colour fundus photographs. Unlike traditional approaches that rely on single-view imaging data and face challenges in generalizing across diverse clinical settings, our method leverages the rich information in the unlabelled multi-view imaging data to improve model robustness and accuracy. By incorporating a class balancing method, a test-time adaptation technique and a multi-view optimization strategy, we address the critical issue of domain shift that often hampers the performance of machine learning models in real-world applications. Experiments comparing various state-of-the-art domain generalization and test-time optimization methodologies show that our approach consistently improves performance when combined with existing baseline and state-of-the-art methods. We also show that our online method improves all existing techniques. Our framework demonstrates improvements in domain generalization capabilities and offers a practical solution for real-world deployment by facilitating online adaptation to new, unseen datasets. Our code is available at https://github.com/zgy600/RetiGen .

Paper Structure

This paper contains 13 sections, 9 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: The figure presents our experimental setup, emphasizing the utilization of multi-view images in the target domain, specifically incorporating four distinct views for each patient: the field centred on the macula ($V_1$), the field centred on the optic disc ($V_2$), and the fields tangent to the upper and lower horizontal lines of the optic disc, respectively ($V_3$ and $V_4$).
  • Figure 2: Overview of the proposed domain-generalization framework, beginning with a pseudo-label-based selection from a multi-view target dataset at the top (PDC). Mini-batch images undergo augmentation before encoding in feature space for test-time self-distillation (TSD), enhancing the source model. The process concludes with ensembling refined multi-view image embeddings (MVLCE) of the same patient, integrating diverse perspectives for improved diagnostic accuracy.
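
The final ensembling step described in Figure 2 can be sketched in a simplified form. The example below assumes each of a patient's four views ($V_1$ to $V_4$) already has a vector of class probabilities, and it simply averages them before taking the most likely class. This plain averaging is a stand-in chosen for illustration: it omits the local clustering of embeddings that MVLCE performs, and the function name `ensemble_views` is hypothetical.

```python
def ensemble_views(view_probs):
    """Average per-view class probabilities for one patient.

    view_probs: list of probability vectors, one per view (e.g. V1..V4).
    Returns (predicted_class_index, averaged_probabilities).
    Assumption: plain averaging, without the local clustering step.
    """
    n_views = len(view_probs)
    n_classes = len(view_probs[0])
    mean = [sum(v[c] for v in view_probs) / n_views for c in range(n_classes)]
    label = max(range(n_classes), key=mean.__getitem__)
    return label, mean


# Illustrative values only: one view disagrees with the other three.
patient = [
    [0.60, 0.40],  # V1: macula-centred field
    [0.40, 0.60],  # V2: optic-disc-centred field
    [0.70, 0.30],  # V3: field tangent to the upper line of the optic disc
    [0.55, 0.45],  # V4: field tangent to the lower line of the optic disc
]
label, mean = ensemble_views(patient)
```

The design point is that a single ambiguous or low-quality view is outvoted by the remaining views of the same patient, which is the intuition behind integrating multiple perspectives.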