KLDD: Kalman Filter based Linear Deformable Diffusion Model in Retinal Image Segmentation

Zhihao Zhao; Yinzheng Zhao; Junjie Yang; Kai Huang; Nassir Navab; M. Ali Nasseri

TL;DR

A Kalman filter based Linear Deformable Diffusion (KLDD) model for retinal vessel segmentation, reported to outperform other methods on retinal fundus (DRIVE, CHASE_DB1) and OCTA-500 datasets.

Abstract

AI-based vascular segmentation is becoming increasingly common in enhancing the screening and treatment of ophthalmic diseases. Deep learning structures based on U-Net have achieved relatively good performance in vascular segmentation. However, small blood vessels and capillaries tend to be lost during segmentation when passed through the traditional U-Net downsampling module. To address this gap, this paper proposes a novel Kalman filter based Linear Deformable Diffusion (KLDD) model for retinal vessel segmentation. Our model employs a diffusion process that iteratively refines the segmentation, leveraging the flexible receptive fields of deformable convolutions in the feature extraction modules to adapt to the detailed tubular vascular structures. More specifically, we first employ a feature extractor with linear deformable convolution to capture vascular structure information from the input images. To better optimize the coordinate positions of the deformable convolution, we employ a Kalman filter to enhance the perception of vascular structures in the linear deformable convolution. Subsequently, the extracted vascular features are used as a conditioning element within the diffusion model through the Cross-Attention Aggregation module (CAAM) and the Channel-wise Soft Attention module (CSAM). These aggregation modules are designed to enhance the diffusion model's capability to generate vascular structures. Experiments are conducted on retinal fundus image datasets (DRIVE, CHASE_DB1) as well as the 3 mm and 6 mm subsets of the OCTA-500 dataset, and the results show that the proposed diffusion model outperforms other methods.
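
To illustrate how a Kalman filter can regularize a sequence of deformable-convolution offsets, the following is a minimal sketch, not the authors' implementation. It assumes a scalar random-walk state model applied along a 1-D chain of sampling offsets; the function name `kalman_smooth_offsets` and the parameters `process_var` and `measurement_var` are hypothetical and do not come from the paper.

```python
def kalman_smooth_offsets(offsets, process_var=1e-2, measurement_var=1.0):
    """Filter a 1-D sequence of predicted offsets with a scalar Kalman filter.

    Assumed model (illustrative only): the true offset follows a random walk,
    x_k = x_{k-1} + w_k, and each predicted offset is a noisy measurement,
    z_k = x_k + v_k.
    """
    if len(offsets) == 0:
        return []
    estimate = float(offsets[0])  # initialise the state with the first offset
    error_var = measurement_var   # initial uncertainty of that estimate
    filtered = [estimate]
    for z in offsets[1:]:
        # Predict: the state is carried over, its uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


# Hypothetical offsets along a vessel; index 3 is an aberrant jump.
raw = [0.0, 0.1, 0.2, 3.0, 0.3, 0.4]
smoothed = kalman_smooth_offsets(raw)
```

In this sketch the aberrant offset is pulled toward its neighbours rather than followed exactly, which is the qualitative behaviour the abstract attributes to the Kalman-filtered coordinates.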

Paper Structure (18 sections, 8 equations, 4 figures, 4 tables)

Introduction
Related work
Deformable Convolution in Segmentation
Diffusion Model in Segmentation
Methodology
Overview of the Proposed Network
Diffusion Model for Vessel Generation
Vascular Features Extracted Module
Feature Aggregation Module by Attention
Loss Function
Experiment
Database
Evaluation Metrics
Implementation Details
Results of Segmentation
...and 3 more sections

Figures (4)

Figure 1: Overview of our proposed KLDD model. The main structure of the model is based on diffusion processes, supplemented by an additional feature extractor designed to guide the generation of specific vascular structures. Within this extractor, a linear deformable convolution is applied to gather information specific to the vascular regions. The two feature aggregation modules, referred to as CAAM and CSAM, are mainly employed to aggregate the features extracted from the input image with those from the denoising module of the diffusion model.
Figure 2: Panel (a) displays the feature map within the network. In panel (b), the gray background illustrates the vascular area, while white circles symbolize standard one-dimensional convolution. Blue circles represent the positions of the field of view following linear deformable convolution, and blue grid circles indicate locations with aberrant offsets. In panel (c), brown circles denote points that have been optimized using the Kalman filter.
Figure 3: The visualization of segmentation results is organized as follows: the first and third rows show the segmentation outcomes of fundus and OCTA, respectively, while the second and fourth rows illustrate enlargements of the cropped regions.
Figure 4: The first and second rows illustrate fundus images, specifically showcasing the convolutional fields of view for horizontal and vertical vascular structures under various deformable convolution settings. Similarly, the third and fourth rows display the visualized fields of view for deformable convolution applied to OCTA images.
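
The Figure 1 caption describes aggregating features extracted from the input image with features from the denoising module. One generic way to do this is scaled dot-product cross-attention, sketched below under stated assumptions: a single head, no learned projections, queries taken from the denoiser features and keys/values from the image features, plus a residual connection. The function name `cross_attention` and these design choices are illustrative and are not taken from the paper's CAAM definition.

```python
import numpy as np


def cross_attention(denoiser_feats, cond_feats):
    """Single-head cross-attention without learned projections (illustrative).

    denoiser_feats: (N, C) tokens used as queries.
    cond_feats:     (M, C) tokens used as keys and values.
    Returns the fused (N, C) features and the (N, M) attention weights.
    """
    channels = denoiser_feats.shape[-1]
    scores = denoiser_feats @ cond_feats.T / np.sqrt(channels)  # (N, M)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over M
    attended = weights @ cond_feats  # (N, C)
    return denoiser_feats + attended, weights  # residual fusion


rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 8))    # hypothetical denoiser tokens
condition = rng.normal(size=(6, 8))  # hypothetical image-feature tokens
fused, attn = cross_attention(queries, condition)
```

Each row of `attn` sums to one, so every denoiser token receives a convex combination of the conditioning tokens before the residual addition.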
