MNet-SAt: A Multiscale Network with Spatial-enhanced Attention for Segmentation of Polyps in Colonoscopy

Chandravardhan Singh Raghaw, Aryan Yadav, Jasmer Singh Sanjotra, Shalini Dangi, Nagendra Kumar

TL;DR

MNet-SAt is an encoder–decoder network for polyp segmentation that combines edge-guided refinement, multiscale feature aggregation, spatial-enhanced attention, and channel-aware atrous spatial pyramid pooling (ASPP). Edge-Guided Feature Enrichment (EGFE) units preserve fine boundary details, while the Hybrid Multi-Scale Attention (HMAtt) module learns spatial-global dependencies across scales. Channel-Enhanced ASPP (CE-ASPP) further recalibrates the multiscale features. The network reports Dice Similarity Coefficients (DSCs) of 96.61% on Kvasir-SEG and 98.60% on CVC-ClinicDB, which the authors describe as state of the art. Ablations confirm the contributions of EGFE, the Multi-Scale Feature Aggregator (MSFA), Spatial-Enhanced Attention (SEAt), and CE-ASPP, and cross-dataset tests indicate that the model generalizes beyond its training data, underscoring its potential clinical value for early polyp detection and for reducing colorectal cancer (CRC) mortality.

Abstract

Objective: To develop a novel deep learning framework for the automated segmentation of colonic polyps in colonoscopy images, overcoming the limitations of current approaches in preserving precise polyp boundaries, incorporating multi-scale features, and modeling spatial dependencies that accurately reflect the intricate and diverse morphology of polyps. Methods: To address these limitations, we propose a novel Multiscale Network with Spatial-enhanced Attention (MNet-SAt) for polyp segmentation in colonoscopy images. This framework incorporates four key modules: Edge-Guided Feature Enrichment (EGFE) preserves edge information for improved boundary quality; Multi-Scale Feature Aggregator (MSFA) extracts and aggregates multi-scale features across channel and spatial dimensions, focusing on salient regions; Spatial-Enhanced Attention (SEAt) captures spatial-aware global dependencies within the multi-scale aggregated features, emphasizing the region of interest; and Channel-Enhanced Atrous Spatial Pyramid Pooling (CE-ASPP) resamples and recalibrates attentive features across scales. Results: We evaluated MNet-SAt on the Kvasir-SEG and CVC-ClinicDB datasets, achieving Dice Similarity Coefficients (DSC) of 96.61% and 98.60%, respectively. Conclusion: Both quantitative (DSC) and qualitative assessments highlight MNet-SAt's superior performance and generalization capabilities compared to existing methods. Significance: MNet-SAt's high accuracy in polyp segmentation holds promise for improving clinical workflows in early polyp detection and more effective treatment, contributing to reduced colorectal cancer mortality rates.
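
For readers unfamiliar with the reported metric, the Dice Similarity Coefficient between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks.

    Masks are nested lists of 0/1 values (an illustrative representation;
    real pipelines would normally use array libraries).
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # |P ∩ G|: pixels marked as polyp in both masks
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    # |P| + |G|: total polyp pixels across the two masks
    total = sum(flat_pred) + sum(flat_target)

    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
print(round(dice_coefficient(pred, target), 4))  # prints 0.6667
```

Here the two masks share 2 polyp pixels and contain 3 each, so the score is 2·2 / (3 + 3) ≈ 0.667. A reported DSC of 96.61% corresponds to a value of 0.9661 on this scale.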

Paper Structure

This paper contains 32 sections, 17 equations, 7 figures, 8 tables.

Figures (7)

  • Figure 1: Polyp patient demographics, image quality issues, and size distribution
  • Figure 2: MNet-SAt framework for polyp segmentation employs a U-shaped architecture with an encoder and decoder linked by skip connections. Edge-Guided Feature Enrichment (EGFE) units within each stage enable feature transfer by extracting features from polyp anatomy, enhancing the segmentation task. A Hybrid Multi-Scale Attention (HMAtt) module bridges the encoder and decoder, learning multi-scale features by prioritizing important regions.
  • Figure 3: Edge maps of the Sobel operator in x-y directions
  • Figure 4: Multi-Head Spatial-Enhanced Attention
  • Figure 5: Channel-Enhanced Atrous Spatial Pyramid Pooling
  • ...and 2 more figures
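
Figure 3 refers to Sobel edge maps in the x and y directions. As background, the sketch below applies the standard 3×3 Sobel kernels to a small grayscale image in plain Python. It illustrates the generic operator only and is not the paper's EGFE unit; the helper names `correlate3x3` and `sobel_edges`, the nested-list image format, and the choice to leave border pixels at zero are assumptions made for this example.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def correlate3x3(image, kernel):
    """Cross-correlate a 2D nested-list image with a 3x3 kernel.

    Border pixels are left at 0.0 (an assumed, simple boundary policy).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = float(sum(
                kernel[j][i] * image[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            ))
    return out


def sobel_edges(image):
    """Return x-gradient, y-gradient, and gradient-magnitude maps."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    magnitude = [
        [math.hypot(a, b) for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]
    return gx, gy, magnitude


# A 4x4 image with a vertical step edge between columns 1 and 2.
step = [[0, 0, 1, 1] for _ in range(4)]
gx, gy, mag = sobel_edges(step)
print(gx[1][1], gy[1][1], mag[1][1])  # prints 4.0 0.0 4.0
```

On the vertical step edge, the x-direction map responds strongly while the y-direction map stays at zero, which is the behavior an edge-guided module can exploit to keep boundary information.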