CUNSB-RFIE: Context-aware Unpaired Neural Schrödinger Bridge in Retinal Fundus Image Enhancement

Xuanzhao Dong, Vamsi Krishna Vasa, Wenhui Zhu, Peijie Qiu, Xiwen Chen, Yi Su, Yujian Xiong, Zhangsihao Yang, Yanxi Chen, Yalin Wang

TL;DR

This work proposes the Context-aware Unpaired Neural Schrödinger Bridge (CUNSB-RFIE), an image-to-image translation pipeline for retinal fundus image enhancement built on the Schrödinger Bridge framework.

Abstract

Retinal fundus photography plays a significant role in diagnosing and monitoring retinal diseases. However, systemic imperfections and operator/patient-related factors can hinder the acquisition of high-quality retinal images. Previous efforts in retinal image enhancement primarily relied on GANs, which are limited by the trade-off between training stability and output diversity. In contrast, the Schrödinger Bridge (SB) offers a more stable solution by utilizing Optimal Transport (OT) theory to model a stochastic differential equation (SDE) between two arbitrary distributions. This allows SB to effectively transform low-quality retinal images into their high-quality counterparts. In this work, we leverage the SB framework to propose an image-to-image translation pipeline for retinal image enhancement. Additionally, previous methods often fail to capture fine structural details, such as blood vessels. To address this, we enhance our pipeline by introducing Dynamic Snake Convolution, whose tortuous receptive field can better preserve tubular structures. We name the resulting retinal fundus image enhancement framework the Context-aware Unpaired Neural Schrödinger Bridge (CUNSB-RFIE). To the best of our knowledge, this is the first endeavor to use the SB approach for retinal image enhancement. Experimental results on a large-scale dataset demonstrate the advantage of the proposed method compared to several state-of-the-art supervised and unsupervised methods in terms of image quality and performance on downstream tasks. The code is available at https://github.com/Retinal-Research/CUNSB-RFIE.
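
To make the idea of a stochastic path between two image distributions more concrete, the sketch below samples an intermediate state of a Brownian bridge pinned at a low-quality image `x0` and a high-quality image `x1`. It is only an illustration under stated assumptions: the function name `bridge_sample` and the noise scale `tau` are hypothetical, the endpoints are given rather than learned, and this is not the authors' implementation (see the linked repository for that).

```python
import numpy as np


def bridge_sample(x0, x1, t, tau=0.1, rng=None):
    """Sample x_t from a Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    The bridge has mean (1 - t) * x0 + t * x1 and per-pixel variance
    tau * t * (1 - t), so the noise vanishes at both endpoints.
    `tau` is a hypothetical noise scale chosen for illustration.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    rng = np.random.default_rng() if rng is None else rng
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    mean = (1.0 - t) * x0 + t * x1          # linear interpolation of endpoints
    std = np.sqrt(tau * t * (1.0 - t))      # largest at t = 0.5, zero at t in {0, 1}
    return mean + std * rng.standard_normal(x0.shape)


# Toy stand-ins for a degraded image and its enhanced counterpart.
low_quality = np.zeros((8, 8))
high_quality = np.ones((8, 8))
midpoint = bridge_sample(low_quality, high_quality, t=0.5,
                         rng=np.random.default_rng(0))
```

In an unpaired setting the pairing between `x0` and `x1` is not known in advance; a learned generator would have to supply the high-quality endpoint, which this toy example simply assumes is available.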

Paper Structure

This paper contains 12 sections, 14 equations, 7 figures, and 3 tables.

Figures (7)

  • Figure 1: Three generative model pipelines in retinal image enhancement: GANs face the challenge of balancing training stability with generation diversity and quality. Diffusion models require paired data from the source domain as a condition and are limited by the prior Gaussian assumption and longer inference times. SB does not suffer from these limitations.
  • Figure 2: Illustration of our CUNSB-RFIE framework. (a) The training pipeline. (b) The U-Net-like generator structure. (c) The Encoder and Decoder block structure in the generator with Dynamic Snake Convolution. (d) The structure of the Bottleneck block with Gaussian noise and time information embedding.
  • Figure 3: Left: the DSC coordinate grids along the column. Right: the DSC coordinate grids along the row.
  • Figure 4: Qualitative illustration of generation results under different types of noise (i.e., illumination, spot artifacts, and blurring). Our pipeline achieves good performance even on images with mixed noise (cols. 4, 5, and 6).
  • Figure 5: Visual comparison of our method with baselines across three datasets. Col 1 represents the high-quality ground truth, Col 2 represents the degraded images, Col 10 shows our results, and the columns in between show the enhanced results from other models.
  • ...and 2 more figures
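
The coordinate grids mentioned in the Figure 3 caption can be illustrated with a small sketch of how a snake-shaped kernel might place its sampling points. This follows the general idea of Dynamic Snake Convolution (bounded offsets accumulated outward from the kernel center so that the sampling path stays connected) and is an assumption-laden illustration, not the paper's code: the function name `snake_kernel_coords`, the `tanh` bounding, and the choice of a kernel laid out along the x-axis are all hypothetical.

```python
import numpy as np


def snake_kernel_coords(center, raw_offsets):
    """Return (x, y) sampling coordinates for a 1-D snake kernel along x.

    raw_offsets holds one unbounded value per kernel position (odd length).
    Each value is squashed to [-1, 1] and accumulated outward from the
    center, so neighbouring sampling points never jump by more than one
    pixel in y and the kernel traces a continuous, curved path.
    """
    raw = np.asarray(raw_offsets, dtype=float)
    k = raw.size
    if k % 2 == 0:
        raise ValueError("kernel size must be odd")
    c = k // 2
    delta = np.tanh(raw)        # bound each step to [-1, 1]
    delta[c] = 0.0              # the center position is never shifted
    shift = np.zeros(k)
    for j in range(1, c + 1):
        shift[c + j] = shift[c + j - 1] + delta[c + j]   # walk to the right
        shift[c - j] = shift[c - j + 1] + delta[c - j]   # walk to the left
    cx, cy = center
    xs = cx + np.arange(-c, c + 1)   # fixed unit steps along the kernel axis
    ys = cy + shift                  # accumulated deformation across the axis
    return np.stack([xs, ys], axis=1)


coords = snake_kernel_coords((5.0, 5.0), np.linspace(-2.0, 2.0, 9))
```

Swapping the two coordinate columns gives the corresponding kernel for the other axis. Because the resulting `y` values are fractional, sampling an actual feature map at these positions would require interpolation, which is omitted here.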