A hybrid Kolmogorov-Arnold network for medical image segmentation

Deep Bhattacharyya, Ali Ayub, A. Ben Hamza

TL;DR

U-KABS is a hybrid framework that integrates Kolmogorov-Arnold Networks into a U-shaped encoder-decoder architecture. It outperforms strong baselines, particularly when segmenting complex anatomical structures.

Abstract

Medical image segmentation plays a vital role in diagnosis and treatment planning, but it remains challenging due to the inherent complexity and variability of medical images, especially when capturing non-linear relationships within the data. We propose U-KABS, a novel hybrid framework that integrates the expressive power of Kolmogorov-Arnold Networks (KANs) with a U-shaped encoder-decoder architecture to enhance segmentation performance. The U-KABS model combines two stages: a convolutional and squeeze-and-excitation (ConvSE) stage, which enhances channel-wise feature representations, and a KAN Bernstein Spline (KABS) stage, which employs learnable activation functions based on Bernstein polynomials and B-splines. This hybrid design leverages the global smoothness of Bernstein polynomials and the local adaptability of B-splines, enabling the model to capture both broad contextual trends and the fine-grained patterns critical for delineating complex structures in medical images. Skip connections between encoder and decoder layers support multi-scale feature fusion and preserve spatial details. Evaluated across diverse medical imaging benchmark datasets, including BUSI, ISIC 2018, and ACDC, U-KABS demonstrates superior performance compared to strong baselines, particularly in segmenting complex anatomical structures.
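
To make the idea of a learnable activation built from Bernstein polynomials and B-splines more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two bases are combined, so the plain sum used here, the function names (`bernstein_basis`, `bspline_basis`, `hybrid_activation`), the uniform knot grid, and the restriction of inputs to [0, 1] are all hypothetical choices.

```python
from math import comb

import numpy as np


def bernstein_basis(x, degree):
    """Bernstein polynomials B_{k,n}(x) on [0, 1]; shape (len(x), degree + 1)."""
    x = np.asarray(x, dtype=float)[:, None]
    k = np.arange(degree + 1)[None, :]
    binom = np.array([comb(degree, i) for i in range(degree + 1)], dtype=float)
    return binom[None, :] * x**k * (1.0 - x) ** (degree - k)


def bspline_basis(x, knots, degree):
    """B-spline basis via the Cox-de Boor recursion.

    Assumes strictly increasing knots, so no denominator is zero.
    Returns shape (len(x), len(knots) - 1 - degree).
    """
    x = np.asarray(x, dtype=float)[:, None]
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator functions of the half-open knot spans.
    basis = ((x >= t[None, :-1]) & (x < t[None, 1:])).astype(float)
    for p in range(1, degree + 1):
        left = (x - t[None, : -p - 1]) / (t[p:-1] - t[: -p - 1]) * basis[:, :-1]
        right = (t[None, p + 1 :] - x) / (t[p + 1 :] - t[1:-p]) * basis[:, 1:]
        basis = left + right
    return basis


def hybrid_activation(x, bern_coef, spline_coef, grid_size, spline_degree=3):
    """Hypothetical activation: Bernstein term (global) plus B-spline term (local).

    In a KAN-style layer the two coefficient vectors would be the learnable
    parameters; here they are plain arrays. spline_coef must have
    grid_size + spline_degree entries.
    """
    # Assumption: inputs are normalized to [0, 1] before the activation.
    x = np.clip(np.asarray(x, dtype=float), 0.0, 1.0)
    bern = bernstein_basis(x, len(bern_coef) - 1) @ np.asarray(bern_coef, dtype=float)
    # Uniform knots extended past [0, 1] so every point is covered by the basis.
    knots = np.arange(-spline_degree, grid_size + spline_degree + 1) / grid_size
    spline = bspline_basis(x, knots, spline_degree) @ np.asarray(spline_coef, dtype=float)
    return bern + spline
```

Each Bernstein polynomial is non-zero across the whole interval, so changing one of its coefficients reshapes the curve globally. Each cubic B-spline is non-zero on only four adjacent knot spans, so changing one of its coefficients adjusts the curve locally. This is the global-versus-local contrast the abstract refers to.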

Paper Structure

This paper contains 14 sections, 11 equations, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Overall architecture of the proposed U-KABS framework. The model follows a symmetric U-shaped encoder-decoder structure, integrating Convolutional and Squeeze-and-Excitation (ConvSE) and KAN Bernstein Spline (KABS) stages. The encoder, comprising three ConvSE blocks followed by two KABS blocks, progressively downsamples the input image to learn hierarchical features, increasing the number of feature channels while halving the spatial resolution at each block. The decoder mirrors the encoder, upsampling features via bilinear interpolation to restore the spatial resolution of the feature maps and generate a pixel-wise segmentation mask.
  • Figure 2: Qualitative evaluation of our method using heatmaps compared to benchmark baselines on the BUSI dataset.
  • Figure 3: Qualitative comparison of our method with benchmark baselines on the ISIC 2018 dataset, highlighting segmentation errors. White, red, and green regions represent predicted segmentation, over-segmentation, and under-segmentation, respectively.
  • Figure 4: Qualitative comparison of our model with baselines on the ACDC dataset, where RV, Myo, LV represent Right Ventricle, Myocardium and Left Ventricle, respectively.
  • Figure 5: Comparison of performance and model efficiency on the BUSI dataset. Our U-KABS model is benchmarked against state-of-the-art methods, including Attention-UNet [Oktay2018], MedT [Valanarasu2021], UNeXt [Valanarasu2022], Rolling-Unet [Liu2024], U-KAN [li2024ukan], and ResU-KAN [wang2025resu]. Performance is assessed using the DSC metric, where higher values indicate better segmentation performance. The size of each circle indicates the number of learnable parameters.
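
Figure 5 reports the DSC metric, the Dice similarity coefficient. For a predicted mask P and a ground-truth mask G it is commonly defined as 2|P ∩ G| / (|P| + |G|). The sketch below computes it for binary masks. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for this example; the paper's evaluation code may handle thresholds, classes, and empty masks differently.

```python
import numpy as np


def dice_score(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / total)


# Tiny example: the prediction marks two pixels, the ground truth marks one of them.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_score(pred, target)  # 2 * 1 / (2 + 1) = 2/3
```

The score ranges from 0 (no overlap) to 1 (identical masks), which is why higher values in Figure 5 correspond to better segmentation.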