DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation

Adnan Munir, Shujaat Khan

TL;DR

DAUNet addresses the demand for accurate and efficient medical image segmentation by integrating a Deformable Convolution V2 bottleneck with the parameter-free SimAM attention mechanism within a UNet framework. The architecture achieves state-of-the-art performance on ultrasound (FH-PS-AoP) and CT (FUMPE) tasks while keeping the parameter count low, and ablation studies confirm the complementary contributions of deformable sampling and SimAM-driven refinement. Its robustness to missing context and its sharp boundary delineation make it suitable for real-time, resource-constrained clinical environments. The results suggest strong potential for deployment on edge devices and motivate future work on multi-modal and 3D extensions with domain adaptation.

Abstract

Medical image segmentation plays a pivotal role in automated diagnostic and treatment planning systems. In this work, we present DAUNet, a novel lightweight UNet variant that integrates Deformable V2 Convolutions and Parameter-Free Attention (SimAM) to improve spatial adaptability and context-aware feature fusion without increasing model complexity. DAUNet's bottleneck employs dynamic deformable kernels to handle geometric variations, while the decoder and skip pathways are enhanced using SimAM attention modules for saliency-aware refinement. Extensive evaluations on two challenging datasets, FH-PS-AoP (fetal head and pubic symphysis ultrasound) and FUMPE (CT-based pulmonary embolism detection), demonstrate that DAUNet outperforms state-of-the-art models in Dice score, HD95, and ASD, while maintaining superior parameter efficiency. Ablation studies highlight the individual contributions of deformable convolutions and SimAM attention. DAUNet's robustness to missing context and low-contrast regions establishes its suitability for deployment in real-time and resource-constrained clinical environments.

Paper Structure

This paper contains 27 sections, 8 equations, 7 figures, and 3 tables.

Figures (7)

  • Figure 1: Comparison between standard and deformable convolutions. (a) Standard convolution with fixed sampling grid. (b) Deformable convolution with learnable offsets. The offsets are dynamically adjusted based on input features, allowing the model to better capture spatial variations.
  • Figure 2: Schematic illustration of the SimAM attention mechanism. For each neuron, an energy-based evaluation is performed using surrounding spatial context, followed by an attention weighting operation. The resulting map highlights informative regions without introducing learnable parameters.
  • Figure 3: Overview of the proposed DAUNet architecture. The network is a lightweight variant of the UNet, incorporating key modifications in the bottleneck and skip connection paths. The bottleneck block is redesigned as a sequence of three convolutions followed by attention: a $1 \times 1$ convolution for channel compression, a $3 \times 3$ Deformable Convolution V2 for adaptive spatial modeling, and another $1 \times 1$ convolution, followed by a SimAM attention block. Additionally, the skip connections are augmented with SimAM modules to enhance feature fusion and emphasize informative activations without increasing parameter count.
  • Figure 4: Segmentation results of different models on the FH-PS-AoP dataset: (a) input image, (b) ground truth mask, (c) UNet, (d) SCUNet++, (e) TransAttUNet, (f) FAT-Net, (g) DAUNet (proposed). The contours drawn around the predicted masks are the ground truth mask contours.
  • Figure 5: Segmentation results of different models on the FUMPE dataset: (a) input image, (b) ground truth mask, (c) UNet, (d) SCUNet++, (e) TransAttUNet, (f) FAT-Net, (g) DAUNet (proposed).
  • ...and 2 more figures
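
To make the parameter-free attention described in the Figure 2 caption more concrete, the following is a minimal pure-Python sketch of a SimAM-style weighting applied to a single feature channel. It follows the commonly used closed-form formulation of SimAM (squared deviation from the channel mean, normalized by the channel variance, passed through a sigmoid). The function names `simam_weights` and `simam`, the list-of-rows input format, and the regularizer value `lam=1e-4` are illustrative assumptions; the summary above does not state DAUNet's exact settings, so this should not be read as the authors' implementation.

```python
import math


def simam_weights(channel, lam=1e-4):
    """Return SimAM-style attention weights for one channel (list of rows).

    `lam` is an assumed regularization constant, not a value taken from
    the DAUNet paper summary.
    """
    values = [v for row in channel for v in row]
    if len(values) < 2:
        raise ValueError("need at least two spatial positions")
    n = len(values) - 1                      # number of "other" neurons
    mu = sum(values) / len(values)           # channel mean
    sq_dev = [[(v - mu) ** 2 for v in row] for row in channel]
    var = sum(d for row in sq_dev for d in row) / n
    weights = []
    for dev_row in sq_dev:
        # Inverse energy: larger for neurons that stand out from the mean.
        inv_energy = [d / (4.0 * (var + lam)) + 0.5 for d in dev_row]
        weights.append([1.0 / (1.0 + math.exp(-e)) for e in inv_energy])
    return weights


def simam(channel, lam=1e-4):
    """Rescale each activation by its attention weight (no learned parameters)."""
    weights = simam_weights(channel, lam)
    return [[v * w for v, w in zip(row, w_row)]
            for row, w_row in zip(channel, weights)]


if __name__ == "__main__":
    feature_map = [[0.0, 0.0, 0.0],
                   [0.0, 5.0, 0.0],
                   [0.0, 0.0, 0.0]]
    w = simam_weights(feature_map)
    # The distinctive center activation receives a larger weight than the
    # uniform background positions.
    print(w[1][1] > w[0][0])
```

Because the weights depend only on per-channel statistics of the input, the operation adds no trainable parameters, which is consistent with the caption's claim that the attention map is produced without learnable parameters.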