MEMO: Dataset and Methods for Robust Multimodal Retinal Image Registration with Large or Small Vessel Density Differences

Chiao-Yi Wang; Faranguisse Kakhi Sadrieh; Yi-Ting Shen; Shih-En Chen; Sarah Kim; Victoria Chen; Achyut Raghavendra; Dongyi Wang; Osamah Saeedi; Yang Tao

TL;DR

The paper tackles robust multimodal retinal image registration between EMA and OCTA, a prerequisite for absolute capillary retinal blood flow measurement that is made harder by large differences in vessel density (VD) between the two modalities. It introduces MEMO, the first public EMA–OCTA dataset with ground-truth registration points and EMA sequences, and presents VDD-Reg, a segmentation-based registration framework. The vessel segmentation module of VDD-Reg is trained with the two-stage LVD-Seg scheme to extract vessels visible in both modalities using as few as three labeled EMA masks: Stage 1 uses a supervised loss, and Stage 2 uses a style loss computed from Gram matrices to learn cross-modality vessel representations. The registration module then aligns the images using SuperPoint features and a partial affine transform estimated with RANSAC. On both CF-FA (small VD differences) and MEMO (large VD differences), VDD-Reg outperforms baselines on RMSE/MAE and Dice metrics, achieves high success rates, and remains feasible with minimal labeling. The work enables robust cross-modality retinal registration and suggests applicability to other modality pairs with dissimilar VD, supporting broader biomedical image fusion and quantitative microvascular analysis.
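To make the Stage 2 idea more concrete, the following is a minimal sketch of a Gram-matrix style loss in plain Python. It is an illustration only: the feature layout (channels by flattened spatial positions), the normalisation by the number of positions, and the names `gram_matrix` and `style_loss` are assumptions, and this summary does not state which network layers or loss weights the paper uses.

```python
from typing import List

# Assumed layout: C channels, each a flat list of N spatial activations.
Feature = List[List[float]]


def gram_matrix(features: Feature) -> List[List[float]]:
    """Return the C x C Gram matrix, normalised by the number of positions."""
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
        for ci in features
    ]


def style_loss(feat_a: Feature, feat_b: Feature) -> float:
    """Mean squared difference between the Gram matrices of two feature maps."""
    g_a, g_b = gram_matrix(feat_a), gram_matrix(feat_b)
    c = len(g_a)
    return sum(
        (g_a[i][j] - g_b[i][j]) ** 2 for i in range(c) for j in range(c)
    ) / (c * c)


# Toy features (hypothetical values): 2 channels, 4 spatial positions.
feat_ema = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
# Same channel statistics as feat_ema, but a different spatial layout.
feat_shifted = [[0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0]]
# Different channel statistics: only the first channel is active.
feat_other = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]

loss_shifted = style_loss(feat_ema, feat_shifted)
loss_other = style_loss(feat_ema, feat_other)
print(loss_shifted, loss_other)
```

Because the Gram matrix sums over spatial positions, the loss compares channel co-activation statistics rather than pixel-wise layout. That property is what makes a style loss usable between unaligned images from two modalities.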

Abstract

The measurement of retinal blood flow (RBF) in capillaries can provide a powerful biomarker for the early diagnosis and treatment of ocular diseases. However, no single modality can determine capillary flow rates with high precision. Combining erythrocyte-mediated angiography (EMA) with optical coherence tomography angiography (OCTA) has the potential to achieve this goal, as EMA can measure the absolute 2D RBF of retinal microvasculature and OCTA can provide 3D structural images of capillaries. However, multimodal retinal image registration between these two modalities remains largely unexplored. To fill this gap, we establish MEMO, the first public multimodal EMA and OCTA retinal image dataset. A unique challenge in multimodal retinal image registration between these modalities is the relatively large difference in vessel density (VD). To address this challenge, we propose a segmentation-based deep-learning framework (VDD-Reg) and a new evaluation metric (MSD), which provide robust results despite differences in vessel density. VDD-Reg consists of a vessel segmentation module and a registration module. To train the vessel segmentation module, we further design a two-stage semi-supervised learning framework (LVD-Seg) that combines supervised and unsupervised losses. We demonstrate that VDD-Reg outperforms baseline methods quantitatively and qualitatively for cases of both small VD differences (using the CF-FA dataset) and large VD differences (using our MEMO dataset). Moreover, VDD-Reg requires as few as three annotated vessel segmentation masks to maintain its accuracy, demonstrating its feasibility.
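The registration module is summarised above as fitting a partial affine transform (rotation, uniform scale, and translation) with RANSAC to matched keypoints. The sketch below illustrates that estimation step in plain Python, under stated assumptions: keypoint detection and matching (SuperPoint in the paper) are not shown, the point coordinates are hypothetical, and the function names, threshold, and iteration count are illustrative rather than the paper's settings. Points are stored as complex numbers x + yj, so the transform is simply dst = a * src + b.

```python
import cmath
import math
import random
from typing import List, Sequence, Tuple


def fit_partial_affine(src: Sequence[complex],
                       dst: Sequence[complex]) -> Tuple[complex, complex]:
    """Least-squares fit of dst ~ a * src + b; |a| is scale, arg(a) is rotation."""
    n = len(src)
    mean_s, mean_d = sum(src) / n, sum(dst) / n
    denom = sum(abs(s - mean_s) ** 2 for s in src)
    if denom == 0:
        raise ValueError("degenerate sample: source points coincide")
    a = sum((s - mean_s).conjugate() * (d - mean_d)
            for s, d in zip(src, dst)) / denom
    return a, mean_d - a * mean_s


def ransac_partial_affine(src: Sequence[complex], dst: Sequence[complex],
                          threshold: float = 2.0, iterations: int = 200,
                          seed: int = 0) -> Tuple[complex, complex, List[int]]:
    """Fit on random 2-point samples, keep the largest inlier set, then refit."""
    rng = random.Random(seed)
    best: List[int] = []
    for _ in range(iterations):
        i, j = rng.sample(range(len(src)), 2)
        if src[i] == src[j]:
            continue  # skip degenerate pairs
        a, b = fit_partial_affine([src[i], src[j]], [dst[i], dst[j]])
        inliers = [k for k in range(len(src))
                   if abs(a * src[k] + b - dst[k]) < threshold]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        raise ValueError("no consistent transform found")
    a, b = fit_partial_affine([src[k] for k in best], [dst[k] for k in best])
    return a, b, best


# Hypothetical matched keypoints: 6 correct matches plus 2 gross mismatches.
true_a = 1.1 * cmath.exp(1j * math.radians(10.0))  # scale 1.1, rotation 10 deg
true_b = complex(5.0, -3.0)                        # translation (5, -3) pixels
src = [10 + 10j, 200 + 30j, 50 + 180j, 120 + 90j,
       30 + 60j, 170 + 150j, 90 + 20j, 60 + 120j]
dst = [true_a * p + true_b for p in src]
dst[6] += 40 - 35j   # simulated mismatch
dst[7] += -25 + 50j  # simulated mismatch

a, b, inliers = ransac_partial_affine(src, dst)
print("scale:", abs(a), "rotation (deg):", math.degrees(cmath.phase(a)))
print("translation:", (b.real, b.imag), "inliers:", inliers)
```

In practice a library routine such as OpenCV's `cv2.estimateAffinePartial2D` with its RANSAC method covers the same step; the standalone version here is only meant to show what is being estimated and why mismatched keypoints do not corrupt the fit.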

Paper Structure (37 sections, 9 equations, 9 figures, 8 tables)

Introduction
Related Works
Retinal Image Datasets with Image Pairs
Multi-Modal Retinal Image Registration
The MEMO Dataset
Overview
EMA
OCTA
Dataset Analysis
Proposed Method
Vessel Segmentation Module
LVD-Seg Background
LVD-Seg Stage 1 - Supervised Loss
LVD-Seg Stage 2 - Unsupervised Loss
Registration Module
...and 22 more sections

Figures (9)

Figure 1: Sample images of (a) CF, (b) FA, (c) EMA and (d) OCTA with vessel density (VD). (a) and (b) are taken from the CF-FA dataset [hajeb2012diabetic]. In this example, the vessel density of OCTA (d) is five times greater than that of EMA (c), since most capillaries cannot be visualized in EMA images.
Figure 2: A typical sample EMA and OCTA pair from our MEMO dataset. Images inside the orange boxes were used for ground truth labeling. (A-1, A-2 and A-3: frames 0, 10 and 20 in the sample EMA image sequence. A-4: the stacked images of the EMA sequence. C-1, C-2 and C-3: the sample OCTA projection images representing the DCP, ICP and SVP layers. C-4: the OCTA B-scan image. B and D: the six corresponding point pairs of the sample EMA and OCTA pair.)
Figure 3: The procedure for image acquisition. The numbers shown in the figure indicate the order.
Figure 4: Image samples corresponding to each eye of each NHP. The EMA image is placed on top of the OCTA image for each image pair.
Figure 5: Statistics of our MEMO dataset. The number of image pairs (count) falling within different ranges of (a) translation along the x-axis (pixels), (b) translation along the y-axis (pixels), or (c) rotation (degrees) is presented. The division of training and test data for the MEMO dataset is outlined in the paper's section on the MEMO experimental dataset (label ExpDataset:MEMO).
...and 4 more figures
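
The vessel density comparison in the Figure 1 caption can be illustrated with a small sketch. It assumes VD is the fraction of pixels marked as vessel in a binary segmentation mask, which is a common definition but is not spelled out in this summary. The masks and the name `vessel_density` are hypothetical toy values chosen to reproduce a five-fold ratio.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = vessel pixel, 0 = background


def vessel_density(mask: Mask) -> float:
    """Fraction of pixels labelled as vessel (assumed definition of VD)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("empty mask")
    vessel = sum(1 for row in mask for px in row if px)
    return vessel / total


# Toy EMA-like mask: one large vessel only, 4 of 20 pixels.
ema_mask = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
# Toy OCTA-like mask: capillaries fill the whole patch, 20 of 20 pixels.
octa_mask = [[1] * 5 for _ in range(4)]

vd_ema = vessel_density(ema_mask)
vd_octa = vessel_density(octa_mask)
print("VD(EMA):", vd_ema, "VD(OCTA):", vd_octa, "ratio:", vd_octa / vd_ema)
```

A gap of this size means that most vessels segmented in OCTA have no counterpart in EMA, which is why the method restricts segmentation to vessels visible in both modalities before matching.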
