Bridging spatial awareness and global context in medical image segmentation

Dalia Alzu'bi, A. Ben Hamza

TL;DR

The paper introduces U-CycleMLP, a lightweight encoder-decoder for 2D medical image segmentation that bridges local detail and global context. The encoder combines position attention weight excitation (PAWE) blocks with dense atrous blocks, while the skip connections and decoder are refined with channel CycleMLP (CCM) blocks, keeping computational complexity linear in the input size. On ISIC, BUSI, and ACDC, the approach achieves higher segmentation accuracy and more robust boundary delineation than state-of-the-art methods while remaining efficient. Ablation studies confirm the contribution of the CCM blocks and of the downsampling strategy. These results suggest potential for practical, multi-modality clinical segmentation with scalable computation.

Abstract

Medical image segmentation is a fundamental task in computer-aided diagnosis, requiring models that balance segmentation accuracy and computational efficiency. However, existing segmentation models often struggle to effectively capture local and global contextual information, leading to boundary pixel loss and segmentation errors. In this paper, we propose U-CycleMLP, a novel U-shaped encoder-decoder network designed to enhance segmentation performance while maintaining a lightweight architecture. The encoder learns multiscale contextual features using position attention weight excitation blocks, dense atrous blocks, and downsampling operations, effectively capturing both local and global contextual information. The decoder reconstructs high-resolution segmentation masks through upsampling operations, dense atrous blocks, and feature fusion mechanisms, ensuring precise boundary delineation. To further refine segmentation predictions, channel CycleMLP blocks are incorporated into the decoder along the skip connections, enhancing feature integration while maintaining linear computational complexity relative to input size. Experimental results, both quantitative and qualitative, across three benchmark datasets demonstrate the competitive performance of U-CycleMLP in comparison with state-of-the-art methods, achieving better segmentation accuracy across all datasets, capturing fine-grained anatomical structures, and demonstrating robustness across different medical imaging modalities. Ablation studies further highlight the importance of the model's core architectural components in enhancing segmentation accuracy.
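
To make the "linear computational complexity relative to input size" claim concrete, the sketch below counts multiply-accumulate operations for a generic per-pixel channel projection (the kind of operation MLP-style blocks are built on) and for global self-attention. It is only an illustration under stated assumptions, not the paper's CycleMLP block: the function names and the channel width of 64 are hypothetical.

```python
def channel_mlp_cost(height: int, width: int, c_in: int, c_out: int) -> int:
    """MACs for a fully connected layer applied independently at every pixel."""
    # Each of the height * width pixels needs one c_in x c_out projection.
    return height * width * c_in * c_out


def self_attention_cost(height: int, width: int, channels: int) -> int:
    """MACs for the QK^T and attention-times-V products of global self-attention."""
    tokens = height * width
    # Two products, each comparing every token with every other token.
    return 2 * tokens * tokens * channels


if __name__ == "__main__":
    # Hypothetical channel width of 64, chosen only for illustration.
    for side in (64, 128, 256):
        mlp = channel_mlp_cost(side, side, 64, 64)
        attn = self_attention_cost(side, side, 64)
        print(f"{side}x{side}: channel MLP {mlp:,} MACs, self-attention {attn:,} MACs")
```

Doubling the image side quadruples the pixel count, so the channel projection cost grows fourfold (linear in the number of pixels), whereas the self-attention cost grows sixteenfold (quadratic).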

Paper Structure

This paper contains 13 sections, 17 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Network architecture of the proposed U-CycleMLP framework for medical image segmentation. The model is a U-shaped encoder-decoder network with five skip connections, each bridging corresponding encoder and decoder layers to facilitate multiscale feature integration and enhance spatial resolution. The encoder employs a position attention weight excitation (PAWE) block repeated twice, together with dense atrous (DA) blocks and downsampling operations. The decoder combines upsampling operations, dense atrous blocks, and feature fusion mechanisms, leveraging channel CycleMLP blocks to refine segmentation predictions while maintaining linear computational complexity relative to input size.
  • Figure 2: Qualitative comparison of segmentation results on the ISIC dataset. Visualizations highlight segmentation errors: white, green, and red regions indicate predicted segmentation, under-segmentation, and over-segmentation, respectively. Compared to the baselines, our U-CycleMLP model most accurately segments lesions with irregular edges.
  • Figure 3: Qualitative comparison of segmentation heatmap results on the BUSI dataset. Our U-CycleMLP model yields the best visualization results for breast tumor segmentation compared to the baselines.
  • Figure 4: Qualitative comparison of segmentation results on the ACDC dataset. Our U-CycleMLP model yields the best visualization results for cardiac structures compared to the baselines.
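
The color coding described in the Figure 2 caption can be reproduced from a predicted mask and a ground-truth mask. The following is a minimal sketch, assuming binary masks and assuming that white marks predicted pixels that agree with the ground truth; the function name `error_map` and the exact RGB values are illustrative choices, not taken from the paper.

```python
import numpy as np


def error_map(pred, truth):
    """Build an RGB image that colors segmentation errors for a binary mask pair."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    rgb = np.zeros(pred.shape + (3,), dtype=np.uint8)  # background stays black
    rgb[pred & truth] = (255, 255, 255)  # white: predicted foreground matching the ground truth (assumption)
    rgb[truth & ~pred] = (0, 255, 0)     # green: under-segmentation (foreground the model missed)
    rgb[pred & ~truth] = (255, 0, 0)     # red: over-segmentation (background predicted as foreground)
    return rgb


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 0, 0]]
    truth = [[1, 0, 1], [0, 0, 0]]
    print(error_map(pred, truth))
```

In this toy example the first pixel is a correct prediction, the second is over-segmented, and the third is under-segmented, so the three colors each appear once in the top row.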