RetiGen: A Framework for Generalized Retinal Diagnosis Using Multi-View Fundus Images

Ze Chen; Gongyu Zhang; Jiayu Huo; Joan Nunez do Rio; Charalampos Komninos; Yang Liu; Rachel Sparks; Sebastien Ourselin; Christos Bergeles; Timothy Jackson

TL;DR

RetiGen tackles domain shift in retinal disease diagnosis by exploiting unlabeled multi-view fundus images. It combines three components: PDC, which balances the class distribution of pseudo-labels; TSD, which refines predictions at test time with a memory-augmented, consistency-based self-distillation scheme; and MVLCE, which locally clusters and ensembles multi-view embeddings. Together they form a framework suited to both online and offline deployment. The method improves domain generalization when integrated with existing DG approaches, and ablations show that TSD and MVLCE provide complementary benefits, with notable gains on a multi-view retinal dataset. In practice, this supports deploying retinal diagnostics across diverse clinics and imaging devices while reducing the need for labeled target-domain data. The training objective can be summarized as $L_t = L_t^{ce} + L_t^{div}$, which balances predictive accuracy ($L_t^{ce}$) against diversity regularization ($L_t^{div}$).
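
To make the objective $L_t = L_t^{ce} + L_t^{div}$ concrete, the following is a minimal sketch rather than the authors' implementation. This summary does not define either term, so the sketch assumes that $L_t^{ce}$ is the cross-entropy against pseudo-labels and that $L_t^{div}$ is the negative entropy of the batch-mean prediction, a common form of diversity regularizer. All function names are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0)

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs_batch, pseudo_labels):
    """Mean negative log-likelihood of the pseudo-labels (assumed form of L^ce)."""
    nll = [-math.log(p[y] + EPS) for p, y in zip(probs_batch, pseudo_labels)]
    return sum(nll) / len(nll)

def diversity_loss(probs_batch):
    """Negative entropy of the batch-mean prediction (assumed form of L^div).

    Minimizing this value pushes the average prediction toward a uniform
    class distribution, discouraging collapse onto a single class.
    """
    n = len(probs_batch)
    num_classes = len(probs_batch[0])
    mean = [sum(p[c] for p in probs_batch) / n for c in range(num_classes)]
    return sum(q * math.log(q + EPS) for q in mean)

def total_loss(logits_batch, pseudo_labels):
    """L_t = L_t^ce + L_t^div, with unit weights as in the summary formula."""
    probs = [softmax(logits) for logits in logits_batch]
    return cross_entropy(probs, pseudo_labels) + diversity_loss(probs)
```

Under these assumptions, a batch whose predictions all fall on one class gets a higher diversity term than a batch spread evenly across classes, so the regularizer counteracts the tendency of pseudo-label training to collapse onto the majority class.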

Abstract

This study introduces a novel framework for enhancing domain generalization in medical imaging, specifically focusing on utilizing unlabelled multi-view colour fundus photographs. Unlike traditional approaches that rely on single-view imaging data and face challenges in generalizing across diverse clinical settings, our method leverages the rich information in the unlabelled multi-view imaging data to improve model robustness and accuracy. By incorporating a class balancing method, a test-time adaptation technique and a multi-view optimization strategy, we address the critical issue of domain shift that often hampers the performance of machine learning models in real-world applications. Experiments comparing various state-of-the-art domain generalization and test-time optimization methodologies show that our approach consistently improves performance when combined with existing baseline and state-of-the-art methods. We also show that our online method improves all existing techniques. Our framework demonstrates improvements in domain generalization capabilities and offers a practical solution for real-world deployment by facilitating online adaptation to new, unseen datasets. Our code is available at https://github.com/zgy600/RetiGen.
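
The class balancing step can be illustrated with a small sketch. The abstract does not give the selection rule, so this example assumes one plausible variant: keep the same number of target samples for each pseudo-label, choosing the most confident ones. The function name, the tuple layout, and the `per_class` parameter are hypothetical and do not come from the paper or its repository.

```python
from collections import defaultdict

def balance_by_pseudo_label(samples, per_class):
    """Keep at most `per_class` samples per pseudo-label, most confident first.

    samples: iterable of (sample_id, pseudo_label, confidence) tuples.
    Returns a dict mapping pseudo_label -> list of selected sample ids.
    This is an assumed selection rule, not the paper's exact PDC procedure.
    """
    by_class = defaultdict(list)
    for sample_id, label, confidence in samples:
        by_class[label].append((confidence, sample_id))

    selected = {}
    for label, items in by_class.items():
        items.sort(reverse=True)  # highest confidence first
        selected[label] = [sample_id for _, sample_id in items[:per_class]]
    return selected

# Toy target set: class 0 is over-represented before balancing.
toy_samples = [
    ("a", 0, 0.90),
    ("b", 0, 0.80),
    ("c", 0, 0.70),
    ("d", 1, 0.60),
    ("e", 1, 0.95),
]
balanced = balance_by_pseudo_label(toy_samples, per_class=2)
```

Capping every pseudo-class at the same count keeps an over-represented class from dominating the adaptation batches, which is the general motivation for calibrating the pseudo-label distribution.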

Paper Structure (13 sections, 9 equations, 2 figures, 3 tables)

This paper contains 13 sections, 9 equations, 2 figures, 3 tables.

Introduction
Methodology
Pseudo-label Based Distribution Calibration (PDC)
Test-time Self-Distillation with regularization (TSD)
Weak-strong consistency and self-distillation
Memory queue
Diversity regularization
The overall loss
Multi-View Local Clustering and Ensembling (MVLCE)
Multi-view Local Clustering
Multi-view Ensembling
Results and Discussion
Conclusion

Figures (2)

Figure 1: The figure presents our experimental setup, emphasizing the utilization of multi-view images in the target domain, specifically incorporating four distinct views for each patient: the field centred on the macula ($V_1$), the field centred on the optic disc ($V_2$), and the fields tangent to the upper and lower horizontal lines of the optic disc, respectively ($V_3$ and $V_4$).
Figure 2: Overview of the proposed domain-generalization framework, beginning with a pseudo-label-based selection from a multi-view target dataset at the top (PDC). Mini-batch images undergo augmentation before encoding in feature space for test-time self-distillation (TSD), enhancing the source model. The process concludes with ensembling refined multi-view image embeddings (MVLCE) of the same patient, integrating diverse perspectives for improved diagnosis accuracy.
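
The final ensembling stage in Figure 2 can be sketched as follows. This is a simplified stand-in, assuming that each of a patient's four views ($V_1$ to $V_4$) has already been mapped to a class-probability vector and that the views are combined by a plain average. The local clustering refinement that MVLCE applies beforehand is not reproduced here, and the function name and the probabilities are illustrative.

```python
def ensemble_views(view_probs):
    """Average class probabilities over one patient's views.

    view_probs: list of per-view probability vectors of equal length.
    Returns (mean_probs, predicted_class). Plain averaging is an
    assumption made for illustration, not the paper's exact rule.
    """
    if not view_probs:
        raise ValueError("need at least one view")
    num_classes = len(view_probs[0])
    n_views = len(view_probs)
    mean_probs = [sum(v[c] for v in view_probs) / n_views for c in range(num_classes)]
    predicted_class = max(range(num_classes), key=lambda c: mean_probs[c])
    return mean_probs, predicted_class

# Illustrative probabilities for one patient; the macula-centred view (V1)
# disagrees with the other three views.
patient_views = {
    "V1": [0.6, 0.3, 0.1],
    "V2": [0.2, 0.7, 0.1],
    "V3": [0.3, 0.6, 0.1],
    "V4": [0.4, 0.5, 0.1],
}
mean_probs, predicted = ensemble_views(list(patient_views.values()))
```

In this toy case a single view on its own would predict class 0, while the patient-level average favours class 1, which shows how combining views can override a misleading single field.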
