GCA-ResUNet: Image segmentation in medical images using grouped coordinate attention

Jun Ding, Shang Gao

TL;DR

This work targets accurate medical image segmentation without prohibitive computation. It introduces GCA-ResUNet, an attention-augmented convolutional network that keeps CNN efficiency while adding global context through Grouped Coordinate Attention (GCA). The backbone is a ResNet-50-based U-Net with GCA modules embedded in the residual blocks, which lets the network model channel dependencies and direction-aware spatial dependencies jointly at minimal overhead. On Synapse and ACDC, the model reaches Dice similarity coefficients (DSC) of 86.11% and 92.64%, respectively, and delineates boundaries better than strong baselines, which indicates robustness on complex anatomy and blurred boundaries. The results suggest that GCA is a practical way to give convolutional architectures global modeling capability for resource-constrained clinical deployments, with potential extensions to multi-task and 3D segmentation.
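
For readers unfamiliar with the metric, the Dice similarity coefficient for a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch in plain Python. The function name `dice_score`, the toy label sequences, and the convention for a class that is absent from both masks are assumptions made for illustration; this summary does not state the paper's exact averaging protocol.

```python
def dice_score(pred, target, label):
    """Dice similarity coefficient for one class over flat label sequences."""
    pred_mask = [p == label for p in pred]
    target_mask = [t == label for t in target]
    # Pixels where prediction and ground truth both carry this label.
    intersection = sum(p and t for p, t in zip(pred_mask, target_mask))
    total = sum(pred_mask) + sum(target_mask)
    # Assumed convention: a class absent from both masks counts as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened label maps (0 = background, 1 and 2 = hypothetical organ classes).
pred = [0, 1, 1, 2, 2, 2, 0, 1]
target = [0, 1, 2, 2, 2, 2, 0, 0]

per_class = {label: dice_score(pred, target, label) for label in (1, 2)}
mean_dsc = sum(per_class.values()) / len(per_class)
print(per_class[1])  # 0.5
```

Multi-organ benchmarks typically report the mean of the per-class scores over the foreground classes, which is what `mean_dsc` computes here.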

Abstract

Medical image segmentation underpins computer-aided diagnosis and therapy by supporting clinical diagnosis, preoperative planning, and disease monitoring. While U-Net style convolutional neural networks perform well due to their encoder-decoder structures with skip connections, they struggle to capture long-range dependencies. Transformer-based variants address global context but often require heavy computation and large training datasets. This paper proposes GCA-ResUNet, an efficient segmentation network that integrates Grouped Coordinate Attention (GCA) into ResNet-50 residual blocks. GCA uses grouped coordinate modeling to jointly encode global dependencies across channels and spatial locations, strengthening feature representation and boundary delineation while adding minimal parameter and FLOP overhead compared with self-attention. On the Synapse dataset, GCA-ResUNet achieves a Dice score of 86.11%, and on the ACDC dataset, it reaches 92.64%, surpassing several state-of-the-art baselines while maintaining fast inference and favorable computational efficiency. These results indicate that GCA offers a practical way to enhance convolutional architectures with global modeling capability, enabling high-accuracy and resource-efficient medical image segmentation.
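
The abstract describes GCA only at a high level, so the following NumPy sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes that "grouped coordinate modeling" means splitting the channels into groups, pooling each group along the height and width directions separately, and using the pooled descriptors to gate the feature map. The function name, the mixing weights `w_h` and `w_w`, and the tensor shapes are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def grouped_coordinate_attention(x, groups, w_h, w_w):
    """Gate a (C, H, W) feature map with per-group directional attention.

    w_h and w_w are hypothetical per-group channel-mixing weights of shape
    (groups, C // groups, C // groups); they stand in for learned layers.
    """
    c, h, w = x.shape
    assert c % groups == 0, "channels must divide evenly into groups"
    cg = c // groups
    xg = x.reshape(groups, cg, h, w)

    # Directional pooling: averaging over width keeps one descriptor per row,
    # averaging over height keeps one descriptor per column.
    pooled_h = xg.mean(axis=3)  # (groups, cg, H)
    pooled_w = xg.mean(axis=2)  # (groups, cg, W)

    # Mix channels within each group, then squash to (0, 1) gates.
    gate_h = sigmoid(np.einsum("gij,gjh->gih", w_h, pooled_h))
    gate_w = sigmoid(np.einsum("gij,gjw->giw", w_w, pooled_w))

    # Each position (row, col) is scaled by its row gate and its column gate.
    out = xg * gate_h[:, :, :, None] * gate_w[:, :, None, :]
    return out.reshape(c, h, w)


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 6))  # toy feature map: 8 channels, 4 x 6 pixels
groups = 2
w_h = rng.standard_normal((groups, 4, 4))
w_w = rng.standard_normal((groups, 4, 4))
y = grouped_coordinate_attention(x, groups, w_h, w_w)
print(y.shape)  # (8, 4, 6)
```

Under these assumptions, the gates are computed from H + W pooled positions per channel, whereas self-attention over the same feature map builds an attention matrix over all H·W × H·W position pairs. That difference is one way to read the abstract's claim of low parameter and FLOP overhead.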

Paper Structure

This paper contains 15 sections, 10 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Schematic diagram of the GCA-ResUNet network. The architecture adopts a U-Net–style encoder–decoder structure, with residual blocks incorporated in the encoder and skip connections to enable multi-scale feature extraction and precise pixel-level segmentation.
  • Figure 2: Grouped Coordinate Attention (GCA) network model diagram
  • Figure 3: Comparison of segmentation performance on the Synapse dataset
  • Figure 4: Comparison of segmentation performance on the ACDC dataset
  • Figure 5: Performance Comparison of Original and Modified ResNet-UNet with Different Modules on Synapse Dataset
  • ...and 1 more figure