A General Model for Retinal Segmentation and Quantification
Zhonghua Wang, Lie Ju, Sijia Li, Wei Feng, Sijin Zhou, Ming Hu, Jianhao Xiong, Xiaoying Tang, Yifan Peng, Mingquan Lin, Yaodong Ding, Yong Zeng, Wenbin Wei, Li Dong, Zongyuan Ge
TL;DR
RetSAM introduces a unified retinal segmentation-to-quantification framework trained on over 200,000 fundus images to deliver robust multi-target segmentation (anatomical structures, lesions, and phenotypes) and over 30 standardized biomarkers. Built on a Swin Transformer backbone with task-decoupled decoders, it uses a three-stage training pipeline (task-specific experts, pseudo-labeling on large public datasets, and private-task adaptation) to achieve strong cross-dataset and cross-modality generalization. Across 17 public benchmarks and multi-task/multi-domain settings, RetSAM demonstrates superior or competitive segmentation performance, with an average DSC gain of 3.9 percentage points and gains of up to 15 percentage points on challenging tasks, while enabling scalable oculomics analyses. The open-source toolkit provides a reproducible segmentation-to-quantification pipeline and harmonized biomarkers for population-scale retinal research and clinical translation.
Abstract
Retinal imaging is fast, non-invasive, and widely available, offering quantifiable structural and vascular signals for ophthalmic and systemic health assessment. This accessibility creates an opportunity to study how quantitative retinal phenotypes relate to ocular and systemic diseases. However, such analyses remain difficult at scale due to the limited availability of public multi-label datasets and the lack of a unified segmentation-to-quantification pipeline. We present RetSAM, a general retinal segmentation and quantification framework for fundus imaging. It delivers robust multi-target segmentation and standardized biomarker extraction, supporting downstream ophthalmologic studies and oculomics correlation analyses. Trained on over 200,000 fundus images, RetSAM supports three task categories and segments five anatomical structures, four retinal phenotypic patterns, and more than 20 distinct lesion types. It converts these segmentation results into over 30 standardized biomarkers that capture structural morphology, vascular geometry, and degenerative changes. Trained with a multi-stage strategy using both private and public fundus data, RetSAM achieves superior segmentation performance on 17 public datasets. It improves on prior best methods by 3.9 percentage points in DSC on average, and by up to 15 percentage points on challenging multi-task benchmarks, and it generalizes well across diverse populations, imaging devices, and clinical settings. The resulting biomarkers enable systematic correlation analyses across major ophthalmic diseases, including diabetic retinopathy, age-related macular degeneration, glaucoma, and pathologic myopia. Overall, RetSAM transforms fundus images into standardized, interpretable quantitative phenotypes, enabling large-scale ophthalmic research and clinical translation.
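For readers less familiar with the metric, the DSC (Dice similarity coefficient) figures above measure overlap between a predicted mask and a reference mask, and a "percentage point" gain is the difference between two DSC values expressed on a 0 to 100 scale. The following is a minimal sketch of both computations, not the RetSAM evaluation code: the function names, the empty-mask convention, and the toy masks and scores are all assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice similarity coefficient (DSC) between two binary masks.

    DSC = 2 * |P intersect T| / (|P| + |T|), ranging from 0 (no overlap)
    to 1 (identical masks).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        # Other conventions (e.g. skipping such cases) are also in use.
        return 1.0
    return float(2.0 * intersection / total)


def percentage_point_gain(new_dsc, baseline_dsc):
    """Difference between two DSC values (each in [0, 1]) in percentage points."""
    return 100.0 * (new_dsc - baseline_dsc)


# Toy 2x3 masks, invented for illustration only.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]

dsc = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)

# Hypothetical scores, not results from the paper.
gain = percentage_point_gain(0.80, 0.75)
```

Reporting gains in percentage points rather than as a relative percentage keeps comparisons across datasets with different baseline DSC values on the same absolute scale.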
