Med-2D SegNet: A Light Weight Deep Neural Network for Medical 2D Image Segmentation

Lameya Sabrin, Md. Sanaullah Chowdhury, Salauddin Tapu, Noyon Kumar Sarkar, Ferdous Bin Ali

TL;DR

Med-2D SegNet introduces a lightweight encoder-decoder architecture built around the Med Block to achieve high-accuracy medical image segmentation with a small parameter footprint. The Med Block expands feature representations, applies depthwise spatial aggregation, and reduces channels to maintain efficiency, all connected via residual paths in a structured encoder–decoder. Across 20 diverse datasets, the model attains competitive Dice scores with only 2.07M parameters, demonstrating strong cross-domain performance and zero-shot capabilities, particularly in polyp segmentation. This work paves the way for deployable, high-performance segmentation tools in clinical environments and resource-limited settings, balancing accuracy, efficiency, and generalization.
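The expand, depthwise-aggregate, reduce pattern described above can be illustrated with a weight count. This is only a rough sketch: the expansion factor of 4, the 3x3 depthwise kernel, and the 64-channel width are assumptions chosen for illustration, not values taken from the paper, and the function names are hypothetical. Biases and normalization layers are ignored.

```python
def med_block_weights(channels: int, expansion: int = 4, kernel: int = 3) -> dict:
    """Count conv weights in a hypothetical expand -> depthwise -> reduce block.

    The expansion factor and kernel size are illustrative assumptions.
    """
    hidden = channels * expansion
    expand = channels * hidden            # 1x1 pointwise conv: C -> C * expansion
    depthwise = hidden * kernel * kernel  # one k x k filter per hidden channel
    reduce_ = hidden * channels           # 1x1 pointwise conv: C * expansion -> C
    return {
        "expand": expand,
        "depthwise": depthwise,
        "reduce": reduce_,
        "total": expand + depthwise + reduce_,
    }


def dense_conv_weights(channels: int, expansion: int = 4, kernel: int = 3) -> int:
    """Weights of a single dense k x k conv operating at the expanded width."""
    hidden = channels * expansion
    return hidden * hidden * kernel * kernel


if __name__ == "__main__":
    print(med_block_weights(64))
    print(dense_conv_weights(64))
```

Under these assumed settings the whole three-layer block holds 35,072 weights, while one dense 3x3 convolution at the expanded width would hold 589,824, roughly 17 times more. This is the general reason depthwise designs keep parameter counts low; the actual Med Block configuration may differ.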

Abstract

Accurate and efficient medical image segmentation is crucial for advancing clinical diagnostics and surgical planning, yet it remains challenging because of the variability of anatomical structures and the demand for low-complexity models. In this paper, we introduce Med-2D SegNet, a novel and highly efficient segmentation architecture that delivers high accuracy while maintaining a minimal computational footprint. Med-2D SegNet achieves state-of-the-art performance on multiple benchmark datasets, including KVASIR-SEG, PH2, EndoVis, and GLAS, with an average Dice similarity coefficient (DSC) of 89.77% across 20 diverse datasets. Central to its success is the compact Med Block, a specialized encoder design that combines dimension expansion with parameter reduction, enabling precise feature extraction while keeping the model at just 2.07 million parameters. Med-2D SegNet also generalizes well across datasets, particularly in polyp segmentation: trained on KVASIR-SEG, it performs strongly on unseen datasets, demonstrating robustness in zero-shot scenarios, although further improvement remains possible. With top-tier performance in both binary and multi-class segmentation, Med-2D SegNet redefines the balance between accuracy and efficiency in medical image analysis. This work paves the way for accessible, high-performance diagnostic tools suited to clinical environments and resource-constrained settings, a step toward the democratization of advanced medical technology.
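For readers unfamiliar with the metric, the Dice similarity coefficient for binary masks is DSC = 2|P ∩ T| / (|P| + |T|), where P is the predicted mask and T the ground truth. The sketch below computes it on flattened 0/1 masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions; the paper's exact evaluation protocol (thresholding, per-image versus per-dataset averaging) is not stated in this summary.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two flattened binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    # One overlapping pixel, three foreground pixels in total: 2 * 1 / 3.
    print(dice_coefficient(predicted, ground_truth))
```

A reported DSC of 89.77% corresponds to a value of 0.8977 on this 0-to-1 scale.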

Paper Structure

This paper contains 15 sections, 14 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Comparison of model parameters vs. DSC on the KVASIR-SEG dataset. Our proposed method achieves a 95.78% Dice score with the lowest parameter count among the compared models, demonstrating an optimal balance between efficiency and segmentation accuracy.
  • Figure 2: The middle panel illustrates the Med-2D SegNet architecture with its encoder-decoder structure. The left block shows the Med Block, detailing each layer's function within the main architecture. The right block illustrates the dimension reduction stages in the encoder. In the diagram, the green blocks denote the dimension reduction phases, where the image dimensions are progressively reduced to $1/2$, $1/4$, $1/8$, and $1/16$ of the input size, compressing the spatial information. The red blocks represent the decoder stages, which progressively reconstruct the spatial dimensions while integrating residual connections to refine the segmentation output. Together, these components form Med-2D SegNet's encoder and decoder, providing efficient feature extraction and accurate reconstruction.
  • Figure 3: (a) Grad-CAM visualizations highlighting key focus areas and model attention. (b) Performance comparison across different encoder architectures, illustrating trends in model performance. Detailed analysis is provided in the Parameter Complexity section of the Supplementary Material.
  • Figure 4: (a) Predicted results on a cross-domain dataset, demonstrating the strength and flexibility of our approach in handling diverse data. (b) Spider plot comparing the performance of our method with state-of-the-art techniques, highlighting its superior or comparable Dice score and strong performance across various evaluation metrics.
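The encoder reduction factors named in the Figure 2 caption ($1/2$, $1/4$, $1/8$, $1/16$) can be made concrete with a small calculation. This is a sketch only: the 256x256 input size is a hypothetical example, not a value stated in this summary, and the function name is an assumption.

```python
def encoder_resolutions(height: int, width: int, factors=(2, 4, 8, 16)):
    """Spatial size of the feature map after each encoder reduction stage."""
    return [(height // f, width // f) for f in factors]


if __name__ == "__main__":
    # Hypothetical 256x256 input, used only for illustration.
    for stage, size in enumerate(encoder_resolutions(256, 256), start=1):
        print(f"stage {stage}: {size[0]}x{size[1]}")
```

The decoder stages would traverse the same sizes in reverse order as they reconstruct the full-resolution segmentation map.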