A Flexible 2.5D Medical Image Segmentation Approach with In-Slice and Cross-Slice Attention
Amarjeet Kumar, Hongxu Jiang, Muhammad Imran, Cyndi Valdes, Gabriela Leon, Dahyun Kang, Parvathi Nataraj, Yuyin Zhou, Michael D. Weiss, Wei Shao
TL;DR
CSA-Net is a flexible 2.5D segmentation framework that jointly models inter-slice and intra-slice dependencies through Cross-Slice Attention (CSA) and In-Slice Attention (ISA). The architecture combines these two attention streams with a Vision Transformer encoder and a CNN-based decoder, and achieves better Dice and HD95 scores than 2D and 2.5D baselines on brain and prostate MRI datasets. Key contributions include the CSA and ISA modules, a multi-head attention design, and demonstrated generalization across binary and multi-class 2.5D tasks, supported by ablations and qualitative analyses. The work advances practical 2.5D segmentation, offering improved accuracy at a lower computational cost than 3D models, with potential clinical applicability to MRI-based neuro- and prostate-imaging workflows.
Abstract
Deep learning has become the de facto method for medical image segmentation, with 3D segmentation models excelling in capturing complex 3D structures and 2D models offering high computational efficiency. However, segmenting 2.5D images, which have high in-plane but low through-plane resolution, is a relatively unexplored challenge. While applying 2D models to individual slices of a 2.5D image is feasible, it fails to capture the spatial relationships between slices. On the other hand, 3D models face challenges such as resolution inconsistencies in 2.5D images, along with computational complexity and susceptibility to overfitting when trained with limited data. In this context, 2.5D models, which capture inter-slice correlations using only 2D neural networks, emerge as a promising solution due to their reduced computational demand and simplicity in implementation. In this paper, we introduce CSA-Net, a flexible 2.5D segmentation model capable of processing 2.5D images with an arbitrary number of slices through an innovative Cross-Slice Attention (CSA) module. This module uses the cross-slice attention mechanism to effectively capture 3D spatial information by learning long-range dependencies between the center slice (for segmentation) and its neighboring slices. Moreover, CSA-Net utilizes the self-attention mechanism to understand correlations among pixels within the center slice. We evaluated CSA-Net on three 2.5D segmentation tasks: (1) multi-class brain MRI segmentation, (2) binary prostate MRI segmentation, and (3) multi-class prostate MRI segmentation. CSA-Net outperformed leading 2D and 2.5D segmentation methods across all three tasks, demonstrating its efficacy and superiority. Our code is publicly available at https://github.com/mirthAI/CSA-Net.
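The two attention mechanisms described in the abstract can be illustrated with a minimal single-head NumPy sketch. This is not the CSA-Net implementation (see the linked repository for that); it assumes each slice has already been encoded into a sequence of feature tokens, and the function names, the projection matrices `w_q`, `w_k`, `w_v`, and the token shapes are all hypothetical choices made for illustration. Queries come from the center slice, while keys and values come either from the neighboring slices (cross-slice) or from the center slice itself (in-slice).

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(query_tokens, context_tokens, w_q, w_k, w_v):
    # Scaled dot-product attention (single head).
    # query_tokens: (N, d), context_tokens: (M, d)
    q = query_tokens @ w_q
    k = context_tokens @ w_k
    v = context_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (N, M)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


def cross_slice_attention(center, neighbors, w_q, w_k, w_v):
    # Assumption: neighbors has shape (S, N, d) for any number of slices S.
    # All neighbor tokens are flattened into one context of S * N tokens,
    # so the same weights work regardless of how many slices are given.
    context = neighbors.reshape(-1, neighbors.shape[-1])
    return attention(center, context, w_q, w_k, w_v)


def in_slice_attention(center, w_q, w_k, w_v):
    # Self-attention: center-slice tokens attend to each other.
    return attention(center, center, w_q, w_k, w_v)


rng = np.random.default_rng(0)
num_tokens, dim = 16, 8
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) for _ in range(3))
center = rng.standard_normal((num_tokens, dim))

for num_neighbors in (2, 4):
    neighbors = rng.standard_normal((num_neighbors, num_tokens, dim))
    out, weights = cross_slice_attention(center, neighbors, w_q, w_k, w_v)
    print(num_neighbors, out.shape, weights.shape)
```

In this sketch the output always has the shape of the center-slice tokens, and only the width of the attention map grows with the number of neighboring slices. That is one simple way a 2.5D model could accept an arbitrary number of slices without changing its parameters.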
