Adaptation of Distinct Semantics for Uncertain Areas in Polyp Segmentation

Quang Vinh Nguyen, Van Thong Huynh, Soo-Hyung Kim

TL;DR

The paper addresses the challenge of polyp segmentation in colonoscopy images, where uncertain areas and background similarity hinder accurate delineation. It introduces ADSNet, a CNN-based framework with an EfficientNet-V2S encoder, a Complementary Trilateral Decoder that generates an early global map $M = \text{CTD}(f_1,f_2,f_3,f_4)$, and a Continuous Attention module that yields Background Semantic ($BS$) and Object Semantic ($OS$) to refine difficult regions, optimized by a joint $Loss(y,\hat{y}) = ACE(y,\hat{y}) + BCE(y,\hat{y})$. The approach achieves state-of-the-art Dice, IoU, and MAE on multiple polyp benchmarks and demonstrates strong generalization to unseen datasets, while remaining compatible with other CNN or Transformer encoders. By explicitly modeling uncertain areas and recovering weak features through $OS$ and $BS$, ADSNet enhances robustness and clinical utility in automated polyp segmentation.
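The three metrics named above (Dice, IoU, and MAE) have standard definitions for binary segmentation, and a small worked example may make them concrete. The sketch below is illustrative only and is not taken from the paper: the function name `dice_iou_mae`, the 0.5 threshold, and the smoothing term `eps` are assumptions, and the authors' evaluation protocol may differ (for instance, by averaging over several thresholds).

```python
from typing import Sequence, Tuple


def dice_iou_mae(
    pred: Sequence[float],
    target: Sequence[int],
    threshold: float = 0.5,  # assumed binarization threshold, not from the paper
    eps: float = 1e-8,       # assumed smoothing term to avoid division by zero
) -> Tuple[float, float, float]:
    """Compute Dice, IoU, and MAE for one flattened prediction/mask pair.

    pred   -- foreground probabilities in [0, 1], one per pixel
    target -- binary ground-truth labels (0 = background, 1 = polyp)
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")

    # Binarize the probability map for the overlap-based metrics.
    binary = [1 if p >= threshold else 0 for p in pred]

    intersection = sum(b * t for b, t in zip(binary, target))
    pred_area = sum(binary)
    target_area = sum(target)
    union = pred_area + target_area - intersection

    dice = (2 * intersection + eps) / (pred_area + target_area + eps)
    iou = (intersection + eps) / (union + eps)

    # MAE is computed on the raw probabilities, before thresholding.
    mae = sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
    return dice, iou, mae


if __name__ == "__main__":
    # Toy 4-pixel example: one true positive, one false positive.
    d, i, m = dice_iou_mae([0.9, 0.8, 0.2, 0.1], [1, 0, 0, 0])
    print(f"Dice={d:.4f} IoU={i:.4f} MAE={m:.4f}")
```

In this toy case the prediction covers two pixels and the mask covers one, with one pixel shared, so Dice is 2/3 and IoU is 1/2; higher is better for both, while a lower MAE is better.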

Abstract

Colonoscopy is a common and practical method for detecting and treating polyps. Segmenting polyps from colonoscopy images is useful for diagnosis and surgical procedures. Nevertheless, achieving excellent segmentation performance remains difficult because of polyp characteristics such as shape, color, and condition, and because polyps are often hard to distinguish from the surrounding context. This work presents a novel architecture, Adaptation of Distinct Semantics for Uncertain Areas in Polyp Segmentation (ADSNet), which corrects misclassified details and recovers weak features that are liable to vanish and go undetected at the final stage. The architecture includes a complementary trilateral decoder that produces an early global map. A continuous attention module modifies the semantics of high-level features to analyze two separate semantics of the early global map. The proposed method is evaluated on polyp benchmarks for both learning ability and generalization ability. Experimental results demonstrate strong correction and recovery ability, leading to better segmentation performance than other state-of-the-art methods on the polyp image segmentation task. Notably, the proposed architecture can be flexibly combined with other CNN-based encoders, Transformer-based encoders, and decoder backbones.

Paper Structure (19 sections, 6 equations, 7 figures, 2 tables)

This paper contains 19 sections, 6 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Segmentation example of our model. Red boxes refer to uncertain areas. Purple boxes stand for noise details.
  • Figure 2: The proposed ADSNet architecture.
  • Figure 3: Continuous Attention.
  • Figure 4: Qualitative analysis on Kvasir-Seg and CVC-ClinicDB dataset of different models in several noteworthy cases.
  • Figure 5: Qualitative analysis on ETIS and CVC-ColonDB dataset of different models in several noteworthy cases.
  • ...and 2 more figures