GroupKAN: Rethinking Nonlinearity with Grouped Spline-based KAN Modeling for Efficient Medical Image Segmentation

Guojie Li, Anwar P. P. Abdul Majeed, Muhammad Ateeq, Anh Nguyen, Fan Zhang

TL;DR

This work tackles the need for accurate yet lightweight and interpretable medical image segmentation. It introduces GroupKAN, a backbone built on group-structured Kolmogorov–Arnold Networks that separate nonlinear activations from channel-wise transformations using Grouped KAN Activation (GKA) and Grouped KAN Transform (GKT), reducing transformation complexity from $O(C^2)$ to $O(C^2/G)$. Evaluated on BUSI, GlaS, and CVC-ClinicDB, GroupKAN achieves an average IoU of $79.80\%$, surpassing U-KAN by $1.11\%$ while using only $3.02$M parameters ($47.6\%$ of U-KAN's $6.35$M) and lower FLOPs, with additional improvements in activation-map plausibility. The results demonstrate a favorable accuracy–efficiency–interpretability trade-off and establish a scalable, group-aware nonlinear modeling paradigm for dense medical segmentation tasks.

Abstract

Medical image segmentation requires models that are accurate, lightweight, and interpretable. Convolutional architectures lack adaptive nonlinearity and transparent decision-making, whereas Transformer architectures are hindered by quadratic complexity and opaque attention mechanisms. U-KAN addresses these challenges using Kolmogorov–Arnold Networks, achieving higher accuracy than both convolutional and attention-based methods, fewer parameters than Transformer variants, and improved interpretability compared to conventional approaches. However, its $O(C^2)$ complexity due to full-channel transformations limits its scalability as the number of channels increases. To overcome this, we introduce GroupKAN, a lightweight segmentation network that incorporates two novel, structured functional modules: (1) Grouped KAN Transform, which partitions channels into $G$ groups for multivariate spline mappings, reducing complexity to $O(C^2/G)$, and (2) Grouped KAN Activation, which applies shared spline-based mappings within each channel group for efficient, token-wise nonlinearity. Evaluated on three medical benchmarks (BUSI, GlaS, and CVC-ClinicDB), GroupKAN achieves an average IoU of $79.80\%$, surpassing U-KAN by $+1.11\%$ while requiring only $47.6\%$ of the parameters ($3.02$M vs. $6.35$M), and shows improved interpretability.
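The complexity claim can be checked with a quick count. The sketch below assumes that a full-channel KAN layer learns one univariate function per (input, output) channel pair, and that grouping keeps only the pairs whose channels fall in the same group. The function names and the example values `C = 256`, `G = 8` are illustrative assumptions, not taken from the paper; only the parameter counts (3.02M and 6.35M) come from the abstract.

```python
def full_kan_edges(channels: int) -> int:
    """Count univariate functions in a full C x C KAN layer (one per channel pair)."""
    return channels * channels


def grouped_kan_edges(channels: int, groups: int) -> int:
    """Count univariate functions when channels are split into equal-sized groups."""
    if channels % groups != 0:
        raise ValueError("channels must be divisible by groups")
    per_group = channels // groups
    # G groups, each with a (C/G) x (C/G) block: G * (C/G)^2 = C^2 / G
    return groups * per_group * per_group


# Illustrative sizes only (assumed, not reported in the paper).
C, G = 256, 8
full = full_kan_edges(C)
grouped = grouped_kan_edges(C, G)
reduction = full // grouped  # equals G when G divides C

# Parameter ratio from the abstract: 3.02M (GroupKAN) vs. 6.35M (U-KAN).
param_ratio_percent = round(100 * 3.02 / 6.35, 1)
```

Under these assumptions the grouped layer keeps a factor of $G$ fewer pairwise functions, which matches the stated move from $O(C^2)$ to $O(C^2/G)$, and the two reported parameter counts reproduce the quoted 47.6% ratio.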

Paper Structure

This paper contains 22 sections, 4 equations, 7 figures, 8 tables.

Figures (7)

  • Figure 1: Accuracy–complexity trade-off in medical segmentation: IoU (%) vs. number of parameters (M) among common backbone models. Marker size reflects each model's computational cost, with areas proportional to their GFLOPs. The dashed line indicates the efficiency frontier. Our GroupKAN (red circle) improves IoU by 1.11% over U-KAN, with 47.6% of U-KAN's parameters.
  • Figure 2: Overview of the GroupKAN pipeline. The encoder extracts features through convolutional blocks. In the bottleneck, each Group ToK-KAN block applies patch embedding and Grouped KAN Activation, followed by $N$ repeated Grouped KAN Transforms with pointwise and depthwise convolutions. The decoder mirrors the encoder with upsampling and skip connections to produce the final segmentation map.
  • Figure 3: Illustration of Grouped KAN Activation. Given token embeddings $T \in \mathbb{R}^{B \times N \times C}$, we split the channel dimension into $G$ groups, each reshaped to $(B \cdot N) \times C_g$. A 1D KAN function $\Phi^{(g)}$ is then applied independently to each scalar dimension in every group. The outputs are reshaped and concatenated to yield the activated representation $T' \in \mathbb{R}^{B \times N \times C}$.
  • Figure 4: Illustration of Grouped KAN Transform. The input feature $X \in \mathbb{R}^{B \times N \times C}$ is divided into $G$ groups along the channel dimension. Each group is reshaped to $(B \cdot N) \times C_g$ and passed through a nonlinear transformation $\Phi^{(g)}$ with learnable mappings of size $C_g \times C_g$. The transformed outputs are reshaped and concatenated to form the output $Z \in \mathbb{R}^{B \times N \times C}$.
  • Figure 5: Qualitative comparison of segmentation results on BUSI, GlaS, and CVC-ClinicDB. GroupKAN produces more accurate and consistent masks across heterogeneous medical imaging modalities, closely aligning with the ground truth.
  • ...and 2 more figures
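The tensor shapes described in the captions of Figures 3 and 4 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a piecewise-linear "hat" basis stands in for the spline basis, the function names (`hat_basis`, `grouped_kan_activation`, `grouped_kan_transform`) and coefficient layouts are hypothetical, and the pointwise and depthwise convolutions mentioned in Figure 2 are omitted.

```python
import numpy as np


def hat_basis(x, grid):
    """Piecewise-linear basis on a uniform grid (assumed stand-in for B-splines).

    x: array of any shape (...,); returns an array of shape (..., K).
    """
    step = grid[1] - grid[0]
    return np.clip(1.0 - np.abs(x[..., None] - grid) / step, 0.0, None)


def grouped_kan_activation(tokens, coef, grid):
    """Apply one shared 1D function per channel group, element-wise.

    tokens: (B, N, C); coef: (G, K), one coefficient vector per group.
    """
    B, N, _ = tokens.shape
    G = coef.shape[0]
    outs = []
    for g, xg in enumerate(np.split(tokens, G, axis=-1)):
        flat = xg.reshape(B * N, -1)             # (B*N, C_g)
        act = hat_basis(flat, grid) @ coef[g]    # (B*N, C_g, K) @ (K,) -> (B*N, C_g)
        outs.append(act.reshape(B, N, -1))
    return np.concatenate(outs, axis=-1)         # (B, N, C)


def grouped_kan_transform(x, coef, grid):
    """Mix channels only within each group using C_g x C_g univariate functions.

    x: (B, N, C); coef: (G, C_g, C_g, K), indexed as [group, in, out, basis].
    """
    B, N, _ = x.shape
    G = coef.shape[0]
    outs = []
    for g, xg in enumerate(np.split(x, G, axis=-1)):
        flat = xg.reshape(B * N, -1)                     # (B*N, C_g)
        basis = hat_basis(flat, grid)                    # (B*N, C_g, K)
        out = np.einsum("nik,ijk->nj", basis, coef[g])   # sum over inputs and basis
        outs.append(out.reshape(B, N, -1))
    return np.concatenate(outs, axis=-1)                 # (B, N, C)


# Small illustrative sizes (assumed); random values stand in for learned weights.
rng = np.random.default_rng(0)
B, N, C, G, K = 2, 5, 8, 4, 6
grid = np.linspace(-2.0, 2.0, K)
tokens = rng.standard_normal((B, N, C))
act_coef = rng.standard_normal((G, K))
tr_coef = rng.standard_normal((G, C // G, C // G, K))

activated = grouped_kan_activation(tokens, act_coef, grid)
transformed = grouped_kan_transform(activated, tr_coef, grid)
```

In this sketch the transform coefficients have $G \cdot C_g^2 \cdot K = (C^2/G) \cdot K$ entries, and changing a channel in one group leaves the outputs of every other group untouched, which is the block-diagonal structure the captions describe.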