TruKAN: Towards More Efficient Kolmogorov-Arnold Networks Using Truncated Power Functions

Ali Bayeh, Samira Sadaoui, Malek Mouhoub

TL;DR

TruKAN introduces a fast, interpretable alternative to traditional KAN activations by replacing B-spline bases with a truncated power function basis complemented by a trainable polynomial backbone. The architecture maintains KAN topology while enabling shared or per-output knots, yielding improved accuracy, memory efficiency, and training speed on a suite of vision benchmarks when integrated into an EfficientNet-V2 framework. Through systematic comparisons with MLP, KAN, SineKAN, and TruKAN variants across small and large networks, the study demonstrates robust performance gains, especially when paired with layer normalization. The work also provides a theoretical alignment with KAN via the equivalence of basis spaces and discusses practical considerations such as numerical stability and normalization to ensure stable training. Overall, TruKAN offers a scalable, interpretable, and efficient path for edge-wise activation learning in high-performance vision models, with clear avenues for future exploration in mixed-precision training and broader applications.

Abstract

To address the trade-off between computational efficiency and adherence to Kolmogorov-Arnold Network (KAN) principles, we propose TruKAN, a new architecture based on the KAN structure and learnable activation functions. TruKAN replaces the B-spline basis in KAN with a family of truncated power functions derived from k-order spline theory. This change preserves KAN's expressiveness while improving accuracy and reducing training time. Each TruKAN layer combines a truncated power term with a polynomial term and employs either shared or individual knots. TruKAN exhibits greater interpretability than other KAN variants due to its simplified basis functions and knot configurations. By prioritizing interpretable basis functions, TruKAN aims to balance approximation efficacy with transparency. We develop the TruKAN model and integrate it into an advanced EfficientNet-V2-based framework, which is then evaluated on computer vision benchmark datasets. To ensure a fair comparison, we develop MLP-, KAN-, SineKAN-, and TruKAN-based EfficientNet frameworks and assess their training time and accuracy across small and deep architectures. The training phase uses hybrid optimization to improve convergence stability. Additionally, we investigate layer normalization techniques for all the models and assess the impact of shared versus individual knots in TruKAN. Overall, TruKAN outperforms other KAN models in terms of accuracy, computational efficiency, and memory usage on complex vision tasks, demonstrating advantages beyond the limited settings explored in prior KAN studies.
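To make the "truncated power term plus polynomial term" description concrete, the following is a minimal sketch of a single edge activation built from the textbook truncated power basis, i.e. a polynomial of degree k plus a weighted sum of terms (x - t_j)_+^k. The function and parameter names (`truncated_power`, `trukan_style_activation`, `poly_coeffs`, `tp_coeffs`) are hypothetical, and the exact parameterization used in the paper may differ.

```python
def truncated_power(x, knot, degree):
    """(x - knot)_+^degree: zero to the left of the knot, a power function to the right."""
    assert degree >= 1, "sketch assumes degree >= 1 (0.0 ** 0 would evaluate to 1.0)"
    return max(x - knot, 0.0) ** degree


def trukan_style_activation(x, poly_coeffs, tp_coeffs, knots, degree=3):
    """Evaluate a polynomial term plus a truncated power term at a scalar x.

    poly_coeffs: coefficients a_0 .. a_degree of the polynomial term (assumed layout)
    tp_coeffs:   one coefficient per knot for the truncated power term
    knots:       knot locations t_j
    """
    poly = sum(a * x ** i for i, a in enumerate(poly_coeffs))
    trunc = sum(c * truncated_power(x, t, degree) for c, t in zip(tp_coeffs, knots))
    return poly + trunc


# Example: identity polynomial plus one active cubic bump starting at the knot 0.0.
knots = [-1.0, 0.0, 1.0]
poly_coeffs = [0.0, 1.0, 0.0, 0.0]
tp_coeffs = [0.0, 2.0, 0.0]
left = trukan_style_activation(-0.5, poly_coeffs, tp_coeffs, knots)   # only the polynomial contributes
right = trukan_style_activation(0.5, poly_coeffs, tp_coeffs, knots)   # 0.5 + 2 * 0.5**3
```

In this form each coefficient in `tp_coeffs` controls how the curve bends after one specific knot, which is one plausible reading of the interpretability claim. With shared knots a single `knots` list would be reused across edges, whereas with individual knots each output would carry its own list.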

Paper Structure (22 sections, 7 equations, 5 figures, 5 tables)

Figures (5)

  • Figure 1: An example of KAN architecture of form $\varphi_\mathbf{t}(x) = \sum_j c_j B_{j,\mathbf{t}}(x)$ according to KAN. The left side presents the notation for activations through the network. The right side depicts an activation function modeled with B-spline. The parameterization of the activation function allows adaptive switching between coarse and fine grid resolutions.
  • Figure 2: Structure of a TruKAN layer, which combines a truncated power component with a polynomial component. (a) A layer with fixed knots: the knot locations are set at equal intervals (here 6) and remain fixed during training. (b) A layer with learnable knots: the knot positions change during training (here 6 variable intervals) and are adjusted so that they stay ordered, positive, and incremental.
  • Figure 3: The illustration of KAN and TruKAN models, along with their pruned versions. (a) Learned activation functions in the KAN model; (b) Pruned KAN model after applying a magnitude threshold of 0.3; (c) Learned activation functions in the TruKAN model, where the composite activation (solid black) is decomposed into its polynomial term and truncated power term (dashed lines in distinct colors); (d) Pruned TruKAN model after the same thresholding.
  • Figure 4: EfficientNet-V2 based framework to build eight different vision models.
  • Figure 5: Selected classifiers: MLP, MLP+Normalization, KAN, KAN+Normalization, SineKAN, SineKAN+Normalization, TruKAN and TruKAN+Normalization.
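
The Figure 2 caption states that learnable knots are kept ordered, positive, and incremental during training, but the listing here does not say how. One common way to obtain that property, shown below purely as an assumed illustration and not as the authors' method, is to learn unconstrained raw parameters and place each knot at the running sum of their softplus values. The names `softplus`, `ordered_knots`, `raw_params`, and `start` are hypothetical.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z)); positive for moderate z."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def ordered_knots(raw_params, start=0.0):
    """Map unconstrained parameters to increasing knot positions.

    Each knot is the previous knot plus a positive increment, so the
    ordering holds regardless of the signs of the raw parameters.
    """
    knots = []
    position = start
    for r in raw_params:
        position += softplus(r)  # assumed reparameterization, not taken from the paper
        knots.append(position)
    return knots


# Six raw parameters give six variable-width intervals, as in Figure 2(b).
knots = ordered_knots([-2.0, 0.0, 3.0, -1.0, 0.5, 1.0])
```

Because the constraint is built into the parameterization, a gradient step on the raw parameters cannot reorder the knots, so no projection or sorting step is needed after each update.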