GrInAdapt: Scaling Retinal Vessel Structural Map Segmentation Through Grounding, Integrating and Adapting Multi-device, Multi-site, and Multi-modal Fundus Domains

Zixuan Liu, Aaron Honjaya, Yuekai Xu, Yi Zhang, Hefu Pan, Xin Wang, Linda G Shapiro, Sheng Wang, Ruikang K Wang

TL;DR

GrInAdapt tackles cross-domain retinal vessel segmentation by addressing distribution shifts across devices and modalities with a source-free, multi-target framework. It grounds multi-view images to a shared anchor, merges region-specific predictions into robust pseudo-labels, and adapts a pre-trained model using a teacher-student scheme guided by the integrated labels. Across the OCTA-500 and AI-READI datasets, it achieves consistent Dice improvements (~4%) and reduced boundary errors (ASSD ~0.42 px), validating the effectiveness of the grounding, integration, and adaptation components. The method can also incorporate auxiliary modalities and ensemble predictions, indicating strong potential for robust, clinically applicable automated retinal vessel analysis in diverse real-world settings.
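For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how a Dice score and an average symmetric surface distance (ASSD) are commonly computed for 2D binary masks. The function names (`dice`, `surface`, `assd`), the 4-neighbour boundary definition, and the brute-force distance computation are illustrative assumptions, not the evaluation code used by the authors.

```python
import numpy as np

def dice(pred, target):
    """Dice overlap between two binary masks (1.0 when both are empty)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom

def surface(mask):
    """Foreground pixels of a 2D mask with at least one 4-neighbour in the background.

    Assumption: pixels outside the image count as background.
    """
    mask = np.asarray(mask, dtype=bool)
    p = np.pad(mask, 1, constant_values=False)
    interior = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
                & p[1:-1, :-2] & p[1:-1, 2:])
    return mask & ~interior

def assd(a, b):
    """Average symmetric surface distance in pixels (brute force, small masks only)."""
    pa = np.argwhere(surface(a)).astype(float)
    pb = np.argwhere(surface(b)).astype(float)
    if len(pa) == 0 or len(pb) == 0:
        raise ValueError("ASSD is undefined when a mask has no foreground")
    # Pairwise Euclidean distances between the two sets of boundary pixels.
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(pa) + len(pb))
```

Higher Dice means better overlap, while lower ASSD means the predicted vessel boundary lies closer to the reference boundary. The pairwise distance matrix grows quadratically with the number of boundary pixels, so a distance-transform-based version would be preferable for full-resolution en face maps.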

Abstract

Retinal vessel segmentation is critical for diagnosing ocular conditions, yet current deep learning methods are limited by modality-specific challenges and significant distribution shifts across imaging devices, resolutions, and anatomical regions. In this paper, we propose GrInAdapt, a novel framework for source-free multi-target domain adaptation that leverages multi-view images to refine segmentation labels and enhance model generalizability for optical coherence tomography angiography (OCTA) of the fundus of the eye. GrInAdapt follows an intuitive three-step approach: (i) grounding images to a common anchor space via registration, (ii) integrating predictions from multiple views to achieve improved label consensus, and (iii) adapting the source model to diverse target domains. Furthermore, GrInAdapt is flexible enough to incorporate auxiliary modalities such as color fundus photography, to provide complementary cues for robust vessel segmentation. Extensive experiments on a multi-device, multi-site, and multi-modal retinal dataset demonstrate that GrInAdapt significantly outperforms existing domain adaptation methods, achieving higher segmentation accuracy and robustness across multiple domains. These results highlight the potential of GrInAdapt to advance automated retinal vessel analysis and support robust clinical decision-making.
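The integration and adaptation steps can be illustrated with a small sketch. It assumes the views have already been registered to the anchor space (step i), and it replaces the region-specific merging described in the paper with a plain per-pixel average over the views that cover each pixel. The exponential-moving-average teacher update is a common choice in teacher-student schemes and is an assumption here; the abstract does not specify the update rule. All names (`integrate_views`, `ema_update`, `momentum`) are hypothetical.

```python
import numpy as np

def integrate_views(probs, valid):
    """Average per-view vessel probabilities over the views covering each pixel.

    probs: (V, H, W) probabilities, already registered to the anchor space.
    valid: (V, H, W) boolean masks, True where a view covers the pixel.
    Returns (consensus, covered): mean probability (0 where uncovered) and
    a boolean mask of pixels seen by at least one view.
    """
    probs = np.asarray(probs, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    counts = valid.sum(axis=0)
    summed = np.where(valid, probs, 0.0).sum(axis=0)
    consensus = np.divide(summed, counts,
                          out=np.zeros_like(summed), where=counts > 0)
    return consensus, counts > 0

def ema_update(teacher, student, momentum=0.99):
    """Assumed teacher update: exponential moving average of student weights.

    teacher, student: dicts mapping parameter names to NumPy arrays.
    """
    return {name: momentum * teacher[name] + (1.0 - momentum) * student[name]
            for name in teacher}
```

In such a setup, thresholding `consensus` (for example at 0.5) on the covered pixels would give a pseudo-label for the student, while the teacher changes slowly through `ema_update`. A faithful implementation would weight views differently in the macula, optic disc, and remaining regions, as the paper's integration step does.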

Paper Structure

This paper contains 25 sections, 9 equations, 3 figures, 7 tables.

Figures (3)

  • Figure 1: Overview of GrInAdapt. a. Paired en face images from different domains. b. Predicted vessel masks used for registration. c. Predictions from source models used for integration. d. Three regions on the retina: macula, optic disc, and other. e. Different predictions are integrated based on the region. f. Refined integrated labels for adaptation. g. Adaptation process with student-teacher and adaptive label merging.
  • Figure 2: The intensity distribution for OCTA volumes from different domains (a, source domain; b-f, target domains). The frequency is accumulated and normalized across the test sets of the OCTA-500 (50 samples) and AI-READI (80 samples) datasets, which are used for testing the source model and evaluating GrInAdapt's performance. Images acquired with the Zeiss Cirrus machine have an intensity distribution more similar to that of the source Optovue domain. Images from Topcon machines (Maestro2 and Triton) have a more different distribution. This indicates a performance gap between the Zeiss Cirrus domains and the Topcon machine domains when evaluating the source model or the adapted model.
  • Figure 3: 2D en face projection map, ground truth label, integrated label, adapted model prediction, and source model prediction for four different eyes. The first two rows highlight eyes where the adapted model successfully improved on the source model, and the bottom row highlights an eye where the adapted model did not fully improve on the source model or made more mistakes.
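The per-domain intensity distributions described in the Figure 2 caption can be approximated with a short sketch: accumulate a histogram over all volumes of a domain, normalize it, and compare two domains with a simple distance. The 8-bit value range, the bin count, and the L1 distance are assumptions chosen for illustration; the caption does not state how the histograms were binned or compared.

```python
import numpy as np

def accumulated_intensity_histogram(volumes, bins=256, value_range=(0.0, 255.0)):
    """Accumulate voxel-intensity counts over several volumes and normalize to sum 1.

    Assumption: intensities lie in value_range (8-bit by default).
    """
    counts = np.zeros(bins, dtype=np.int64)
    for vol in volumes:
        hist, _ = np.histogram(np.asarray(vol).ravel(), bins=bins, range=value_range)
        counts += hist
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)

def histogram_l1_distance(p, q):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    return float(np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float)).sum())
```

Under these assumptions, a small distance between a target domain and the source domain would correspond to the "more similar" case the caption reports for Zeiss Cirrus, and a larger distance to the Topcon case.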