Polyp segmentation in colonoscopy images using DeepLabV3++

Al Mohimanul Islam, Sadia Shakiba Bhuiyan, Mysun Mashira, Md. Rayhan Ahmed, Salekul Islam, Swakkhar Shatabda

TL;DR

DeepLabV3++ addresses the challenge of precise polyp segmentation in colonoscopy images by enhancing the DeepLabV3+ framework with an EfficientNetV2S encoder, a Multi-Scale Pyramid Pooling (MSPP) module, and a Parallel Attention Aggregation Block (PAAB). This design aims to improve boundary delineation and robustness across polyps of varying sizes while maintaining computational efficiency. Evaluated on CVC-ColonDB, CVC-ClinicDB, and Kvasir-SEG, the model achieves high Dice and mean IoU scores with fewer trainable parameters than the baseline and outperforms several state-of-the-art methods. The work shows potential for clinical deployment and lists model quantization, broader medical-image applications, and transformer-based extensions as future work.

Abstract

Segmenting polyps in colonoscopy images is essential for the early identification and diagnosis of colorectal cancer, a significant cause of cancer deaths worldwide. Prior deep-learning models, such as attention-based variants, UNet variants, and Transformer-derived networks, have had notable success in capturing intricate features and complex polyp shapes. In this study, we introduce DeepLabV3++, an enhanced version of the DeepLabV3+ architecture designed to improve the precision and robustness of polyp segmentation in colonoscopy images. The proposed model incorporates diverse separable convolutional layers and attention mechanisms within the MSPP block, enhancing its capacity to capture multi-scale and directional features. Additionally, the redesigned decoder further transforms the features extracted by the encoder into a more meaningful segmentation map. Our model was evaluated on three public datasets (CVC-ColonDB, CVC-ClinicDB, and Kvasir-SEG), achieving Dice coefficient scores of 96.20%, 96.54%, and 96.08%, respectively. The experimental analysis shows that DeepLabV3++ outperforms several state-of-the-art models in polyp segmentation tasks. Furthermore, compared to the baseline DeepLabV3+ model, DeepLabV3++, with its MSPP module and redesigned decoder architecture, significantly reduces segmentation errors (e.g., false positives and false negatives) across small, medium, and large polyps. This improvement in polyp delineation is crucial for accurate clinical decision-making in colonoscopy.
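
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how a Dice coefficient and an IoU score can be computed for a single pair of binary masks. It is not the authors' evaluation code: the function name `dice_and_iou`, the nested-list mask representation, and the convention of scoring two empty masks as 1.0 are all assumptions made for illustration. A mean IoU would additionally average such scores over images or classes, and the page does not say which averaging the paper uses.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks.

    Both masks are equal-sized nested lists of 0/1 values
    (hypothetical representation chosen for this sketch).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # polyp pixel correctly predicted
            elif p:
                fp += 1      # predicted polyp, actually background
            elif t:
                fn += 1      # missed polyp pixel
    if tp + fp + fn == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0
    dice = 2 * tp / (2 * tp + fp + fn)
    iou = tp / (tp + fp + fn)
    return dice, iou


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    truth = [[1, 0, 0],
             [0, 1, 1]]
    # Here tp = 2, fp = 1, fn = 1, so Dice = 4/6 and IoU = 2/4.
    print(dice_and_iou(pred, truth))
```

Because Dice counts the overlap twice in both numerator and denominator, it is never lower than IoU for the same pair of masks, which is worth keeping in mind when comparing the two columns of a results table.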

Paper Structure

This paper contains 24 sections, 14 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Proposed DeepLabV3++ architecture
  • Figure 2: Architectural Composition of Parallel Attention Aggregation Block (PAAB)
  • Figure 3: Qualitative comparison of segmentation results using Unet3+, MultiResUnet, UnetR, Polyp-PVT, DeepLabV3+, and the proposed DeepLabV3++ models on the CVC-ColonDB, CVC-ClinicDB, and Kvasir-SEG datasets. Green, blue, and red boxes mark exemplary ROIs, acceptable results, and unsatisfactory results, respectively.
  • Figure 4: Segmentation error comparison of DeepLabV3+ and the proposed DeepLabV3++. The highlighted area represents the XOR difference between the actual and predicted masks.
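
The XOR difference mentioned in the Figure 4 caption can be sketched as follows. This is only an illustration of the idea, not the authors' plotting code: the function names `xor_error_map` and `error_fraction` and the nested-list mask format are hypothetical. A pixel is flagged when exactly one of the two masks marks it as polyp, so the map covers false positives and false negatives together.

```python
def xor_error_map(pred, truth):
    """Return a mask that is 1 where prediction and ground truth disagree."""
    return [
        [int(bool(p) != bool(t)) for p, t in zip(pred_row, truth_row)]
        for pred_row, truth_row in zip(pred, truth)
    ]


def error_fraction(error_map):
    """Return the share of pixels flagged as errors (0.0 for an empty map)."""
    total = sum(len(row) for row in error_map)
    flagged = sum(sum(row) for row in error_map)
    return flagged / total if total else 0.0


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    truth = [[1, 0, 0],
             [0, 1, 1]]
    errors = xor_error_map(pred, truth)
    print(errors)                  # disagreement mask
    print(error_fraction(errors))  # fraction of mismatched pixels
```

A smaller highlighted area for one model than for another on the same image means fewer mismatched pixels, which is how the figure's side-by-side comparison would be read.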