KaLDeX: Kalman Filter based Linear Deformable Cross Attention for Retina Vessel Segmentation

Zhihao Zhao, Shahrooz Faghihroohi, Yinzheng Zhao, Junjie Yang, Shipeng Zhong, Kai Huang, Nassir Navab, Boyang Li, M. Ali Nasseri

TL;DR

Empirical evidence shows that KaLDeX, a vascular segmentation network that integrates a Kalman filter based linear deformable cross attention module into a UNet++ framework, outperforms the current best models on several vessel segmentation datasets.

Abstract

Background and Objective: In ophthalmic imaging, accurate vascular segmentation is essential for diagnosing and managing various eye diseases. Contemporary deep learning-based vascular segmentation models rival human accuracy but still struggle to segment very small blood vessels accurately. Because CNN models apply multiple downsampling operations, fine details from high-resolution images are inevitably lost. The objective of this study is to design a structure that captures delicate, small blood vessels. Methods: To address these issues, we propose a novel network (KaLDeX) for vascular segmentation that leverages a Kalman filter based linear deformable cross attention (LDCA) module, integrated within a UNet++ framework. Our approach is based on two key components: a Kalman filter (KF) based linear deformable convolution (LD) module and a cross-attention (CA) module. The LD module is designed to adaptively adjust the focus on thin vessels that might be overlooked by standard convolution. The CA module improves the global understanding of vascular structures by aggregating the detailed features from the LD module with the high-level features from the UNet++ architecture. Finally, we adopt a topological loss function based on persistent homology to constrain the topological continuity of the segmentation. Results: The proposed method is evaluated on retinal fundus image datasets (DRIVE, CHASE_DB1, and STARE) as well as the 3 mm and 6 mm subsets of the OCTA-500 dataset, achieving an average accuracy (ACC) of 97.25%, 97.77%, 97.85%, 98.89%, and 98.21%, respectively. Conclusions: Empirical evidence shows that our method outperforms the current best models on different vessel segmentation datasets. Our source code is available at: https://github.com/AIEyeSystem/KalDeX.
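
To make the aggregation step of the CA module more concrete, the following is a minimal sketch of scaled dot-product cross attention in plain Python. It is not the authors' implementation: the function names, the absence of learned projection matrices, and the choice of detail features as queries with high-level features as keys and values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross attention without learned projections.

    queries: list of d-dimensional vectors (ASSUMED: detail features, e.g. from LD)
    keys:    list of d-dimensional vectors (ASSUMED: high-level features)
    values:  list of vectors, one per key
    Returns one aggregated vector per query.
    """
    dim = len(queries[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


# Hypothetical toy features: two query positions, two key/value positions.
detail = [[1.0, 0.0], [0.0, 1.0]]
high_level_keys = [[1.0, 0.0], [0.0, 1.0]]
high_level_values = [[10.0], [20.0]]
fused = cross_attention(detail, high_level_keys, high_level_values)
```

Each output vector is a convex combination of the value vectors, so a query that resembles a given key pulls the result toward that key's value.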

Paper Structure

This paper contains 27 sections, 7 equations, 8 figures, 7 tables.

Figures (8)

  • Figure 1: The first row shows the ophthalmic images and their segmentation masks. The second row represents the ROI (region of interest) areas from the first row (in green). The ROI indicates the thin vascular structures that are difficult to recognize.
  • Figure 2: Overview of our proposed KaLDeX. (a) is the overall framework of our proposed structure. (b) showcases the key module (LDCA), which primarily consists of two sub-modules: LD($b_1$) and CA($b_2$). LD is primarily focused on capturing vascular structures through deformable convolutions and optimizing the coordinates of deformable convolutions using KF. CA aggregates the output feature maps of LD with the feature maps obtained from up-sampling and down-sampling in the network.
  • Figure 3: The LD module captures vascular structures through linear deformable convolution and optimizes the coordinates of the convolution using KF.
  • Figure 4: In (a), the gray background illustrates the vascular area, while white circles symbolize standard one-dimensional convolution. Green circles represent the positions of the field of view following linear deformable convolution. In (b), brown circles denote points that have been optimized using the Kalman filter.
  • Figure 5: CA aggregates the output feature maps of LD with the feature maps obtained from up-sampling and down-sampling in the network.
  • ...and 3 more figures
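
The Figure 4 caption describes sampling points being optimized with a Kalman filter. As a rough illustration of that idea, the sketch below runs a scalar Kalman filter over the offsets of a one-dimensional kernel. It is a sketch under stated assumptions, not the paper's method: the random-walk state model, the noise variances, and the name `kalman_smooth` are all hypothetical.

```python
def kalman_smooth(offsets, process_var=1e-2, measurement_var=1e-1):
    """Filter a sequence of 1-D sampling offsets with a scalar Kalman filter.

    ASSUMPTIONS (not from the paper): the true offset follows a random walk
    with variance `process_var`, and each predicted offset is a noisy
    measurement of it with variance `measurement_var`.
    """
    estimate = offsets[0]   # initial state: trust the first offset
    error_var = 1.0         # initial uncertainty of the estimate
    smoothed = [estimate]
    for measurement in offsets[1:]:
        # Predict: the state is unchanged, but its uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate = estimate + gain * (measurement - estimate)
        error_var = (1.0 - gain) * error_var
        smoothed.append(estimate)
    return smoothed


# Hypothetical raw offsets along a kernel, with one outlier at index 2.
raw_offsets = [0.0, 0.0, 2.0, 0.0, 0.0]
filtered_offsets = kalman_smooth(raw_offsets)
```

Because every update is a weighted average of the previous estimate and the new measurement, the filtered offsets stay within the range of the raw offsets, and an isolated outlier is damped instead of followed exactly.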