MedKAN: An Advanced Kolmogorov-Arnold Network for Medical Image Classification

Zhuoqin Yang, Jiansong Zhang, Xiaoling Luo, Zheng Lu, Linlin Shen

TL;DR

MedKAN tackles the challenge of modeling both local textures and global context in medical images by leveraging Kolmogorov-Arnold Networks (KAN) with convolutional extensions. It introduces Local Information KAN (LIK) for local features and Global Information KAN (GIK) for global context, and provides MedKAN-S, MedKAN-B, and MedKAN-L variants for different compute budgets. Evaluations on nine MedMNIST datasets show MedKAN outperforms CNN- and Transformer-based baselines in ACC and AUC, with MedKAN-B achieving the best results. The results underscore the viability of KAN-based architectures for robust, data-efficient medical image classification and broader clinical deployment.

Abstract

Recent advancements in deep learning for image classification predominantly rely on convolutional neural networks (CNNs) or Transformer-based architectures. However, these models face notable challenges in medical imaging, particularly in capturing intricate texture details and contextual features. Kolmogorov-Arnold Networks (KANs) represent a novel class of architectures that enhance nonlinear transformation modeling, offering improved representation of complex features. In this work, we present MedKAN, a medical image classification framework built upon KAN and its convolutional extensions. MedKAN features two core modules: the Local Information KAN (LIK) module for fine-grained feature extraction and the Global Information KAN (GIK) module for global context integration. By combining these modules, MedKAN achieves robust feature modeling and fusion. To address diverse computational needs, we introduce three scalable variants: MedKAN-S, MedKAN-B, and MedKAN-L. Experimental results on nine public medical imaging datasets demonstrate that MedKAN achieves superior performance compared to CNN- and Transformer-based models, highlighting its effectiveness and generalizability in medical image analysis.

Paper Structure

This paper contains 13 sections, 5 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: The input medical images are first processed by a basic convolutional module for initial feature extraction and dimensionality reduction, followed by a patch embedding step that further refines the spatial and channel dimensions. The resulting features then pass through a series of stacked modules, including the Local Information KAN (LIK) module for fine-grained feature extraction and the Global Information KAN (GIK) module for capturing long-range dependencies.
  • Figure 2: Grad-CAM visualizations and classification predictions of MedKAN and ResNet-50 on four datasets (BlM, DeM, OCTM, TiM). The ground truth and corresponding model predictions are displayed for each dataset. Grad-CAM heatmaps highlight the regions of interest identified by each model, demonstrating MedKAN’s improved localization and classification accuracy compared to ResNet-50.