
A Flexible 2.5D Medical Image Segmentation Approach with In-Slice and Cross-Slice Attention

Amarjeet Kumar, Hongxu Jiang, Muhammad Imran, Cyndi Valdes, Gabriela Leon, Dahyun Kang, Parvathi Nataraj, Yuyin Zhou, Michael D. Weiss, Wei Shao

TL;DR

CSA-Net is a flexible 2.5D segmentation framework that jointly models inter-slice and intra-slice dependencies through Cross-Slice Attention (CSA) and In-Slice Attention (ISA). The architecture combines these two attention streams with a Vision Transformer encoder and a CNN-based decoder, achieving superior Dice and HD95 on brain and prostate MRI datasets compared to 2D and 2.5D baselines. Key contributions include the CSA and ISA modules, a multi-head attention design, and demonstrated generalization across single- and multi-class 2.5D tasks, supported by ablations and qualitative analyses. The work advances practical 2.5D segmentation by improving accuracy over 2D and 2.5D baselines while avoiding the computational demands of 3D models, with potential applicability to MRI-based brain and prostate imaging workflows.

Abstract

Deep learning has become the de facto method for medical image segmentation, with 3D segmentation models excelling in capturing complex 3D structures and 2D models offering high computational efficiency. However, segmenting 2.5D images, which have high in-plane but low through-plane resolution, is a relatively unexplored challenge. While applying 2D models to individual slices of a 2.5D image is feasible, it fails to capture the spatial relationships between slices. On the other hand, 3D models face challenges such as resolution inconsistencies in 2.5D images, along with computational complexity and susceptibility to overfitting when trained with limited data. In this context, 2.5D models, which capture inter-slice correlations using only 2D neural networks, emerge as a promising solution due to their reduced computational demand and simplicity in implementation. In this paper, we introduce CSA-Net, a flexible 2.5D segmentation model capable of processing 2.5D images with an arbitrary number of slices through an innovative Cross-Slice Attention (CSA) module. This module uses the cross-slice attention mechanism to effectively capture 3D spatial information by learning long-range dependencies between the center slice (for segmentation) and its neighboring slices. Moreover, CSA-Net utilizes the self-attention mechanism to understand correlations among pixels within the center slice. We evaluated CSA-Net on three 2.5D segmentation tasks: (1) multi-class brain MRI segmentation, (2) binary prostate MRI segmentation, and (3) multi-class prostate MRI segmentation. CSA-Net outperformed leading 2D and 2.5D segmentation methods across all three tasks, demonstrating its efficacy and superiority. Our code is publicly available at https://github.com/mirthAI/CSA-Net.
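The two attention mechanisms described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it omits learned projections and multiple heads, and it assumes that tokens from all neighboring slices are pooled into a single key/value set. The function names `attend`, `cross_slice_attention`, and `in_slice_attention` are hypothetical. The point it shows is that the output has one vector per center-slice token, so the number of neighboring slices can vary freely.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of token vectors."""
    scale = math.sqrt(len(queries[0]))
    dim_out = len(values[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_out)]
        )
    return outputs


def cross_slice_attention(center_tokens, neighbor_slices):
    """Queries come from the center slice; keys/values from its neighbors.

    Assumption (not from the paper): neighbor tokens are simply concatenated.
    """
    context = [token for slice_tokens in neighbor_slices for token in slice_tokens]
    return attend(center_tokens, context, context)


def in_slice_attention(center_tokens):
    """Plain self-attention among the tokens of the center slice."""
    return attend(center_tokens, center_tokens, center_tokens)


# Toy data: 2 tokens per slice, 2 features per token.
center = [[1.0, 0.0], [0.0, 1.0]]
two_neighbors = [[[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0], [0.5, 0.5]]]
four_neighbors = two_neighbors + two_neighbors

out_two = cross_slice_attention(center, two_neighbors)
out_four = cross_slice_attention(center, four_neighbors)
out_self = in_slice_attention(center)
```

In this toy setup, `out_two`, `out_four`, and `out_self` all contain two output vectors of length two, regardless of how many neighboring slices were supplied.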

Paper Structure

This paper contains 40 sections, 5 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Illustration of the inputs to 2D, 2.5D, and 3D segmentation models.
  • Figure 2: Overview of CSA-Net's architecture.
  • Figure 3: Overview of our Cross-Slice Attention (left) and In-Slice Attention (right) architecture.
  • Figure 4: Segmentation results of 2.5D models on a representative subject from each of the three datasets. First row: green is the brain volume and yellow is the ventricles. Second row: green is the prostate capsule. Third row: yellow is the transition zone, green is the peripheral zone, orange is the urethra, and blue is the anterior fibromuscular stroma. Note that for the ProstateX dataset, the prostate is relatively small in the MRI; thus, we show only a central region for better visualization of the results.
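
The difference between the 2D and 2.5D inputs illustrated in Figure 1 can be sketched as a slice-window selection. This is an illustrative assumption, not the paper's preprocessing: the helper name `slice_window` is hypothetical, and clamping indices at the volume boundary (which repeats the edge slice) is only one possible way to handle the first and last slices.

```python
def slice_window(volume, center, k):
    """Return the 2k+1 slices centered on index `center`.

    Indices that fall outside the volume are clamped to the nearest valid
    slice (an assumed edge-handling choice, not taken from the paper).
    """
    last = len(volume) - 1
    return [volume[min(max(i, 0), last)] for i in range(center - k, center + k + 1)]


# Stand-in volume: strings take the place of 2D image slices.
volume = ["s0", "s1", "s2", "s3", "s4"]

input_2d = slice_window(volume, 2, 0)    # center slice only
input_25d = slice_window(volume, 2, 1)   # center slice plus one neighbor per side
input_edge = slice_window(volume, 0, 1)  # first slice, with the missing neighbor clamped
```

A 3D model would instead take the whole `volume` (or a 3D patch of it) as a single input.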