Snake with Shifted Window: Learning to Adapt Vessel Pattern for OCTA Segmentation

Xinrun Chen, Mei Shen, Haojian Ning, Mengzhan Zhang, Chengliang Wang, Shiying Li

TL;DR

This work addresses the challenge of segmenting retinal vessels in OCTA en-face images by capturing both tubular vessel morphology and global vascular distribution. It proposes SSW-OCTA, a lightweight U-Net–style model that combines dynamic snake convolution for local deformations with a shifted window transformer for global context. The authors demonstrate state-of-the-art or competitive performance on the OCTA-500 dataset across RV, FAZ, capillary, artery, and vein tasks, and offer ablations to guide architecture and hyperparameter choices, highlighting a favorable trade-off between accuracy and efficiency (~170k parameters). The approach has practical impact for OCTA-based retinal disease analysis and lays groundwork for robust, deployable OCTA segmentation tools.

Abstract

Segmenting specific targets or structures in optical coherence tomography angiography (OCTA) images is fundamental to further pathological studies. The retinal vascular layers are rich and intricate, and their complex vascular shapes can be captured by the widely studied OCTA images. In this paper, we therefore study how to use OCTA images with projected vascular layers to segment retinal structures. To this end, we propose the SSW-OCTA model, which combines the advantages of deformable convolutions, which suit tubular structures, with those of the Swin Transformer for global feature extraction, adapting to the characteristics of the OCTA modality. Our model was tested and compared on the OCTA-500 dataset, achieving state-of-the-art performance. The code is available at: https://github.com/ShellRedia/Snake-SWin-OCTA.

Paper Structure

This paper contains 13 sections, 5 equations, 7 figures, and 4 tables.

Figures (7)

  • Figure 1: OCTA scanned 3D volume in various views: (a) perspective; (b) en-face; (c) B-scan.
  • Figure 2: Schematic diagram of the SSW-OCTA model.
  • Figure 3: Deformation mechanism of dynamic snake convolution layer.
  • Figure 4: Patch merging and (S)W-MSA. SW-MSA and W-MSA follow a similar process: both divide the feature map into patches, group the patches into windows, compute masked multi-head self-attention within each window, and then reassemble the feature map from the patches. Compared to W-MSA, SW-MSA introduces a "cyclic shift" step, which enhances feature interaction between far-apart patches.
  • Figure 5: Two model architecture types designed to explore a better feature extraction strategy for SSW-OCTA.
  • ...and 2 more figures
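
The window grouping and cyclic shift described in the Figure 4 caption can be illustrated with a small sketch. This is not the authors' implementation: the 4x4 grid of patch indices, the window size of 2, the shift of 1 (half the window size, as in the usual Swin Transformer convention), and the helper names `cyclic_shift` and `window_partition` are all assumptions chosen for illustration. The attention computation itself is omitted. The sketch only shows which patches end up sharing a window.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and to the left by `shift` cells, wrapping at the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks (row-major order)."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    windows = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            windows.append([grid[r][c]
                            for r in range(r0, r0 + window)
                            for c in range(c0, c0 + window)])
    return windows


# Hypothetical 4x4 feature map whose entries are patch indices 0..15.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

# W-MSA style grouping: attention would be computed inside each of these windows.
regular = window_partition(grid, 2)

# SW-MSA style grouping: shift first, then partition with the same windows.
shifted = window_partition(cyclic_shift(grid, 1), 2)

print(regular)  # [[0, 1, 4, 5], [2, 3, 6, 7], [8, 9, 12, 13], [10, 11, 14, 15]]
print(shifted)  # [[5, 6, 9, 10], [7, 4, 11, 8], [13, 14, 1, 2], [15, 12, 3, 0]]
```

After the shift, the first window holds patches 5, 6, 9, and 10, one from each of the four regular windows, so information can cross the original window boundaries. Windows created by wrap-around, such as the one holding 15, 12, 3, and 0, group patches that are not spatially adjacent, which is the case the attention mask in the standard Swin Transformer design is meant to handle.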