MicroAUNet: Boundary-Enhanced Multi-scale Fusion with Knowledge Distillation for Colonoscopy Polyp Image Segmentation

Ziyi Wang, Yuanmei Zhang, Dorna Esrafilzadeh, Ali R. Jalili, Suncheng Xiang

TL;DR

MicroAUNet targets real-time colonoscopy polyp segmentation with precise boundary delineation by combining boundary-aware multi-scale feature fusion with a lightweight single-path attention mechanism. A progressive two-stage knowledge-distillation framework transfers semantic and boundary cues from a high-capacity teacher (MALUNet), enabling strong performance in a compact model (0.0249M parameters, 0.148 GFLOPs). Experiments on Kvasir-SEG and CVC-ClinicDB show state-of-the-art accuracy among ultra-light models, supporting its suitability for real-time clinical use. The code is open source, and the work demonstrates how boundary-focused design combined with distillation can yield robust, efficient polyp segmentation across datasets.

Abstract

Early and accurate segmentation of colorectal polyps is critical for reducing colorectal cancer mortality and has been extensively explored by academia and industry. However, current deep learning-based polyp segmentation models either compromise clinical decision-making by producing ambiguous polyp margins or rely on heavy architectures with high computational complexity, resulting in inference speeds too low for real-time colorectal endoscopic applications. To address this problem, we propose MicroAUNet, a lightweight attention-based segmentation network that combines depthwise-separable dilated convolutions with a single-path, parameter-shared channel-spatial attention block to strengthen multi-scale boundary features. Building on this network, a progressive two-stage knowledge-distillation scheme transfers semantic and boundary cues from a high-capacity teacher. Extensive experiments on benchmarks demonstrate state-of-the-art accuracy under extremely low model complexity, indicating that MicroAUNet is suitable for real-time clinical polyp segmentation. The code is publicly available at https://github.com/JeremyXSC/MicroAUNet.
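
To make the parameter savings behind depthwise-separable dilated convolutions concrete, the following is a minimal arithmetic sketch, not the authors' implementation. The function names and the channel sizes (32 in, 64 out) are illustrative assumptions and do not come from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dw_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Spatial extent covered by a k x k kernel at the given dilation rate."""
    return dilation * (k - 1) + 1


# Hypothetical layer sizes, chosen only for illustration.
standard = conv_params(32, 64, 3)           # 18432 weights
separable = dw_separable_params(32, 64, 3)  # 2336 weights
span = effective_kernel(3, 2)               # a 3 x 3 kernel at dilation 2 spans 5 pixels
```

For these assumed sizes the separable layer needs roughly 7.9 times fewer weights than the standard one. Dilation enlarges the spatial extent of the kernel without adding weights, which is why the two techniques pair naturally in compact multi-scale designs.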

Paper Structure

This paper contains 17 sections, 7 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Motivation and overall design concept of the proposed MicroAUNet framework. It illustrates two limitations of existing polyp segmentation models, blurred boundaries and high computational cost, and shows how MicroAUNet integrates boundary enhancement, lightweight attention, and knowledge distillation to overcome them.
  • Figure 2: Overview of the proposed MicroAUNet, which enhances boundary perception and achieves lightweight segmentation through depthwise separable convolutions, shared attention, and progressive two-stage knowledge distillation.
  • Figure 3: Quantitative comparison of different segmentation models on the Kvasir and CVC datasets. Radar charts visualise the performance (mDice, mIoU, Accuracy, Specificity, and Sensitivity) of UNet, SANet, UNeXt, MALUNet, and the proposed MicroAUNet, demonstrating the superior accuracy–efficiency trade-off of MicroAUNet.
  • Figure 4: Comprehensive analysis of efficiency and performance across models. Scatter plots compare parameter count and computational complexity (Params and FLOPs) against segmentation accuracy, illustrating that MicroAUNet achieves the best balance among the lightweight and full-scale architectures compared.
  • Figure 5: Visual comparison between the proposed method and four state-of-the-art methods. The examples show that MicroAUNet produces sharper, more continuous boundaries and fewer background artefacts than UNet, SANet, UNeXt, and MALUNet, yielding results closest to the ground truth.
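
The metrics named in the Figure 3 caption (Dice, IoU, Accuracy, Specificity, Sensitivity) can all be derived from per-pixel confusion counts on a binary mask. The sketch below uses the standard textbook definitions; it is not taken from the authors' evaluation code. The function names, the `eps` smoothing term, and the toy masks are assumptions for illustration.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over two binary masks given as nested lists of 0/1."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, target, eps=1e-7):
    """Standard binary segmentation metrics; eps (an assumption) avoids division by zero."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),
        "iou": tp / (tp + fp + fn + eps),
        "accuracy": (tp + tn) / (tp + fp + fn + tn + eps),
        "sensitivity": tp / (tp + fn + eps),
        "specificity": tn / (tn + fp + eps),
    }


# Toy 2 x 3 masks, invented for illustration only.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
metrics = segmentation_metrics(pred, target)
```

Averaging the per-image Dice and IoU values over a test set gives the mDice and mIoU figures typically reported for polyp segmentation benchmarks.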