MuTri: Multi-view Tri-alignment for OCT to OCTA 3D Image Translation

Zhuangzhuang Chen, Hualiang Wang, Chubin Ou, Xiaomeng Li

TL;DR

This work tackles the challenge of translating 3D OCT images to 3D OCTA images without relying on expensive hardware or costly vascular annotations. It introduces MuTri, a two-stage framework that learns OCT→OCTA translation in a discrete space using vector-quantized representations and multi-view priors from 3D OCT, 3D OCTA, and 2D OCTA projection maps. Stage I pre-trains OCT and OCTA VQ-VAEs to provide semantic priors, while Stage II employs contrastive-inspired semantic alignment and vessel structure alignment to guide the translation, leveraging pre-trained reconstruction models. The authors also release OCTA2024, a large-scale OCT-OCTA dataset, and demonstrate state-of-the-art performance across multiple datasets, highlighting the method’s robustness and potential for low-cost OCTA synthesis in clinical contexts.
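To make "translation in a discrete space" concrete, the following is a minimal sketch of the nearest-neighbour lookup at the core of vector quantization in general. It is not the authors' implementation: the function name `quantize`, the toy codebook, and the 2-dimensional latents are all hypothetical and chosen only for illustration.

```python
def quantize(latents, codebook):
    """Replace each continuous latent vector by its nearest codebook entry.

    Illustrative sketch only: `latents` and `codebook` are plain lists of
    equal-length float lists; real models operate on batched 3D feature maps.
    """
    indices = []
    for z in latents:
        # Squared Euclidean distance from this latent to every codebook entry.
        dists = [sum((zi - ei) ** 2 for zi, ei in zip(z, e)) for e in codebook]
        indices.append(dists.index(min(dists)))
    # The quantized output can only take values from the finite codebook.
    quantized = [list(codebook[i]) for i in indices]
    return indices, quantized


# Toy example (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]
indices, quantized = quantize(latents, codebook)
```

Because every latent is snapped to one of a fixed number of entries, the decoder only ever sees a finite set of codes, which is the sense in which the mapping is learned in a "discrete and finite" space.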

Abstract

Optical coherence tomography angiography (OCTA) is important for imaging microvascular networks because it provides accurate 3D imaging of blood vessels, but it relies on specialized sensors and expensive devices. For this reason, previous works have shown the potential of translating readily available 3D optical coherence tomography (OCT) images into 3D OCTA images. However, existing OCTA translation methods directly learn the mapping from the OCT domain to the OCTA domain in a continuous and infinite space with guidance from only a single view, i.e., the OCTA projection map, resulting in suboptimal results. To this end, we propose a multi-view tri-alignment framework, named MuTri, for OCT to OCTA 3D image translation in a discrete and finite space. In the first stage, we pre-train two vector-quantized variational auto-encoders (VQ-VAEs) by reconstructing 3D OCT and 3D OCTA data, providing semantic priors for the subsequent multi-view guidance. In the second stage, our multi-view tri-alignment helps another VQ-VAE model learn the mapping from the OCT domain to the OCTA domain in a discrete and finite space. Specifically, a contrastive-inspired semantic alignment is proposed to maximize the mutual information with the pre-trained models from the OCT and OCTA views, facilitating codebook learning. Meanwhile, a vessel structure alignment is proposed to minimize the structure discrepancy with the pre-trained models from the OCTA projection map view, so that the model learns detailed vessel structure information. We also collect the first large-scale dataset, namely OCTA2024, which contains paired OCT and OCTA volumes from 846 subjects.
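The abstract does not give the form of the contrastive-inspired semantic alignment. As a hedged illustration of how a contrastive objective can encourage high mutual information between two sets of features, the sketch below implements the widely used InfoNCE loss in plain Python. This is a stand-in technique, not the paper's loss: the names `info_nce`, `student`, and `teacher`, and the temperature value, are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(student, teacher, temperature=0.1):
    """InfoNCE loss: student[i] should match teacher[i] and no other teacher[j].

    Hypothetical setup: `student` stands for features of the translation
    model and `teacher` for features of a frozen pre-trained model.
    """
    n = len(student)
    total = 0.0
    for i in range(n):
        logits = [cosine(student[i], teacher[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the matching (positive) pair.
        total += log_denominator - logits[i]
    return total / n


# Toy features (hypothetical): aligned pairs give a much lower loss.
student = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(student, [[1.0, 0.0], [0.0, 1.0]])
mismatched_loss = info_nce(student, [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this kind pulls each feature toward its paired counterpart and away from the other samples in the batch, which is one common way to maximize a lower bound on mutual information.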

Paper Structure

This paper contains 16 sections, 11 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Existing OCTA translation methods learn the mapping in an infinite and continuous space: (a) 2D OCT projection map (PM) or B-scan images as input, (b) 3D OCT volume as input with only single-view guidance from the 2D OCT PM. (c) A vanilla VQ-VAE has a lower codebook utilization rate on the OCT to OCTA 3D image translation task, resulting in a less informative codebook. Note that the codebook utilization rate is computed using training samples from three datasets.
  • Figure 2: The overall pipeline of MuTri. It consists of two stages to facilitate OCT to OCTA 3D image translation. (a) Stage 1 employs two VQ-VAEs pre-trained on the OCT and OCTA volumes to provide multi-view guidance: 3D OCT, 3D OCTA, and 2D OCTA projection map. (b) Stage 2 uses another VQ-VAE that takes OCT volumes as input and reconstructs the OCTA volumes, guided by our contrastive-inspired semantic alignment from the 3D OCT and OCTA views and by vessel structure alignment from the 2D OCTA projection map view.
  • Figure 3: Visualizations of real and fake OCTA projection maps (PM).
  • Figure 4: Hyper-parameter sensitivity study on OCTA-3M dataset.
  • Figure 5: Disease patterns of decreased capillary density in translated OCTA projection maps (PM), annotated by an experienced ophthalmologist. Each diseased pattern is shown alongside the real OCTA image and the translation results from TransPro and our MuTri. Disease levels are annotated from "+" to "+++", where the number of "+" signs denotes the degree of severity.
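Two quantities recur in the captions above: the codebook utilization rate (Figure 1) and the 2D projection map of a 3D volume (Figures 3 and 5). The sketch below shows one plausible way to compute each. Both definitions are assumptions: utilization is taken as the fraction of codebook entries selected at least once, and the projection is taken as a mean over the depth axis. The page does not state the exact definitions used in the paper, and the function names are hypothetical.

```python
def codebook_utilization(indices, codebook_size):
    """Fraction of codebook entries that were selected at least once.

    Assumed definition; `indices` is a flat list of code indices collected
    over a set of samples.
    """
    return len(set(indices)) / codebook_size


def projection_map(volume):
    """Collapse a volume indexed as volume[depth][row][col] into a 2D map.

    Assumption: mean-intensity projection along the depth axis. Other
    projections (e.g. maximum intensity) follow the same pattern.
    """
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    return [
        [sum(volume[d][y][x] for d in range(depth)) / depth for x in range(width)]
        for y in range(height)
    ]


# Toy inputs (hypothetical values).
utilization = codebook_utilization([0, 0, 3, 3, 1], codebook_size=8)
toy_volume = [
    [[0, 2], [4, 6]],  # depth slice 0
    [[2, 4], [6, 8]],  # depth slice 1
]
pm = projection_map(toy_volume)
```

A low utilization value means most codebook entries are never selected, which is the situation Figure 1(c) attributes to a vanilla VQ-VAE.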