U-MAN: U-Net with Multi-scale Adaptive KAN Network for Medical Image Segmentation

Bohan Huang, Qianyun Bao, Haoyuan Ma

TL;DR

The paper addresses medical image segmentation, where fine details and precise boundaries are hard to preserve. It proposes U-MAN, a U-Net variant that combines Progressive Attention-Guided Feature Fusion (PAGF) with a Multi-scale Adaptive KAN (MAN) module to bridge the encoder-decoder semantic gap and enable adaptive multi-scale feature extraction. On BUSI, GLAS, and CVC-ClinicDB, U-MAN reports state-of-the-art IoU and F1 scores, including IoU gains over a U-KAN baseline, and ablation studies attribute the gains to both PAGF and MAN. The authors present the method as a practical route to better boundary delineation and detail preservation across medical imaging modalities.
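
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how IoU and F1 (equivalent to the Dice coefficient for binary masks) are commonly computed. The function name `iou_and_f1`, the `eps` smoothing term, and the toy masks are illustrative assumptions; the paper's exact evaluation protocol (thresholding, per-image versus dataset-level averaging) is not stated in this summary.

```python
import numpy as np

def iou_and_f1(pred, target, eps=1e-7):
    """Return (IoU, F1) for two binary masks of the same shape.

    eps is an assumed smoothing term that avoids division by zero;
    with it, two empty masks score 0 under this convention.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()    # predicted and true foreground
    fp = np.logical_and(pred, ~target).sum()   # predicted foreground only
    fn = np.logical_and(~pred, target).sum()   # missed foreground
    iou = tp / (tp + fp + fn + eps)            # intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn + eps)     # harmonic mean of precision and recall
    return float(iou), float(f1)

# Toy masks (hypothetical data): 2 true positives, 1 false positive, 1 false negative.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
iou, f1 = iou_and_f1(pred, target)
print(f"IoU={iou:.3f}, F1={f1:.3f}")
```

Both metrics count the same three quantities, and F1 is never lower than IoU on the same prediction because it counts the overlap twice.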

Abstract

Medical image segmentation faces significant challenges in preserving fine-grained details and precise boundaries due to complex anatomical structures and pathological regions. These challenges primarily stem from two key limitations of conventional U-Net architectures: (1) their simple skip connections ignore the semantic gap between encoder and decoder features, and (2) they lack the capability for multi-scale feature extraction in deep layers. To address these challenges, we propose the U-Net with Multi-scale Adaptive KAN (U-MAN), a novel architecture that enhances the emerging Kolmogorov-Arnold Network (KAN) with two specialized modules: Progressive Attention-Guided Feature Fusion (PAGF) and the Multi-scale Adaptive KAN (MAN). Our PAGF module replaces the simple skip connection, using attention to fuse features from the encoder and decoder. The MAN module enables the network to adaptively process features at multiple scales, improving its ability to segment objects of various sizes. Experiments on three public datasets (BUSI, GLAS, and CVC-ClinicDB) show that U-MAN outperforms state-of-the-art methods, particularly in defining accurate boundaries and preserving fine details.

Paper Structure

This paper contains 18 sections, 6 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Overall architecture of U-MAN. The encoder uses Convolution Blocks for shallow feature extraction, and MAN modules perform adaptive multi-scale processing in the deeper layers. Each skip connection is enhanced by a PAGF module that fuses features from the encoder and decoder paths.
  • Figure 2: Architecture of MAN module. The structure employs dual-branch processing with KAN blocks for learnable activation-based feature processing and MSAB blocks for multi-scale attention feature extraction, combined through adaptive fusion mechanisms.
  • Figure 3: Architecture of PAGF module. The structure employs dual-path attention with channel and spatial mechanisms, followed by gated fusion for intelligent encoder-decoder feature combination.
  • Figure 4: Qualitative segmentation results of U-MAN across three medical imaging datasets. The top row shows original images from BUSI, CVC-ClinicDB, and GLAS. The bottom row presents corresponding binary segmentation masks generated by our method, demonstrating accurate boundary delineation across diverse medical imaging modalities.
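
The Figure 3 caption describes PAGF as channel and spatial attention paths followed by gated fusion. The NumPy sketch below illustrates that general pattern only and is not the authors' implementation. The weight shapes, the scalar spatial coefficients `a` and `b`, the parallel averaging of the two attention paths, and the convex-combination gate are all assumptions, and random arrays stand in for learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w):
    """Reweight channels using global average pooling. x: (C, H, W), w: (C, C)."""
    pooled = x.mean(axis=(1, 2))             # (C,) one descriptor per channel
    weights = sigmoid(w @ pooled)            # (C,) channel weights in (0, 1)
    return x * weights[:, None, None]

def spatial_attention(x, a, b):
    """Reweight pixels using channel-wise mean and max maps (a, b: assumed scalars)."""
    avg_map = x.mean(axis=0)                 # (H, W)
    max_map = x.max(axis=0)                  # (H, W)
    weights = sigmoid(a * avg_map + b * max_map)
    return x * weights[None, :, :]

def gated_fusion(enc, dec, w_gate):
    """Blend encoder and decoder features with a learned gate in (0, 1).

    w_gate: (C, 2C), acting like a 1x1 convolution on the concatenated features.
    """
    stacked = np.concatenate([enc, dec], axis=0)                 # (2C, H, W)
    gate = sigmoid(np.einsum("oc,chw->ohw", w_gate, stacked))    # (C, H, W)
    return gate * enc + (1.0 - gate) * dec

# Hypothetical shapes and random stand-ins for learned weights.
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
enc = rng.standard_normal((C, H, W))         # encoder skip features
dec = rng.standard_normal((C, H, W))         # upsampled decoder features
w_ch = rng.standard_normal((C, C)) * 0.1
w_gate = rng.standard_normal((C, 2 * C)) * 0.1

# Assumed dual-path arrangement: run both attention paths in parallel and average.
enc_att = 0.5 * (channel_attention(enc, w_ch) + spatial_attention(enc, a=0.5, b=0.5))
fused = gated_fusion(enc_att, dec, w_gate)
print(fused.shape)
```

Because the gate lies strictly between 0 and 1, each fused value is a convex combination of the attended encoder feature and the decoder feature at that position. This is one simple way to let the network decide, per location, how much encoder detail to pass through the skip connection.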