TruKAN: Towards More Efficient Kolmogorov-Arnold Networks Using Truncated Power Functions
Ali Bayeh, Samira Sadaoui, Malek Mouhoub
TL;DR
TruKAN is a fast, interpretable alternative to standard KAN activations: it replaces the B-spline basis with a truncated power function basis complemented by a trainable polynomial backbone. The architecture keeps the KAN topology and supports either shared or per-output knots. Integrated into an EfficientNet-V2 framework, it improves accuracy, memory efficiency, and training speed on a suite of vision benchmarks. Systematic comparisons of MLP, KAN, SineKAN, and TruKAN variants across small and large networks show consistent performance gains, especially when TruKAN is paired with layer normalization. The work also aligns TruKAN with KAN theoretically, via the equivalence of their basis spaces, and discusses practical considerations such as numerical stability and normalization for stable training. Overall, TruKAN offers a scalable, interpretable, and efficient path for edge-wise activation learning in high-performance vision models, with avenues for future work in mixed-precision training and broader applications.
Abstract
To address the trade-off between computational efficiency and adherence to Kolmogorov-Arnold Network (KAN) principles, we propose TruKAN, a new architecture based on the KAN structure and learnable activation functions. TruKAN replaces the B-spline basis in KAN with a family of truncated power functions derived from k-order spline theory. This change preserves the expressiveness of KAN while improving accuracy and reducing training time. Each TruKAN layer combines a truncated power term with a polynomial term and employs either shared or individual knots. Because of its simplified basis functions and knot configurations, TruKAN is more interpretable than other KAN variants; by prioritizing interpretable basis functions, it aims to balance approximation efficacy with transparency. We develop the TruKAN model and integrate it into an advanced EfficientNet-V2-based framework, which is then evaluated on computer vision benchmark datasets. To ensure a fair comparison, we develop MLP-, KAN-, SineKAN-, and TruKAN-based EfficientNet frameworks and assess their training time and accuracy across small and deep architectures. The training phase uses hybrid optimization to improve convergence stability. Additionally, we investigate layer normalization techniques for all the models and assess the impact of shared versus individual knots in TruKAN. Overall, TruKAN outperforms other KAN models in accuracy, computational efficiency, and memory usage on complex vision tasks, demonstrating advantages beyond the limited settings explored in prior KAN studies.
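To make the layer description concrete, the following is a minimal NumPy sketch of a single edge activation built from a polynomial term plus truncated power terms, using the standard spline form phi(x) = sum_j a_j x^j + sum_i c_i max(x - t_i, 0)^k. It is an illustration under stated assumptions, not the paper's implementation: the function name `truncated_power_activation`, the argument names, the default degree of 3, and the example coefficients are all hypothetical.

```python
import numpy as np

def truncated_power_activation(x, knots, poly_coef, trunc_coef, degree=3):
    """Evaluate phi(x) = sum_j a_j x^j + sum_i c_i * max(x - t_i, 0)^degree.

    Hypothetical sketch: names and shapes are assumptions, not the paper's API.
    x          : array of inputs, any shape
    knots      : (m,) knot locations t_i
    poly_coef  : (degree + 1,) polynomial coefficients a_0..a_degree
    trunc_coef : (m,) coefficients c_i of the truncated power terms
    """
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    poly_coef = np.asarray(poly_coef, dtype=float)
    trunc_coef = np.asarray(trunc_coef, dtype=float)

    # Polynomial term: a_0 + a_1 x + ... + a_degree x^degree
    powers = np.arange(degree + 1)
    poly = (x[..., None] ** powers) @ poly_coef

    # Truncated power term: each basis function is zero to the left of its knot
    trunc = np.maximum(x[..., None] - knots, 0.0) ** degree
    return poly + trunc @ trunc_coef

# Illustrative values only (not taken from the paper)
knots = np.array([0.0, 1.0])
poly_coef = np.array([1.0, 0.0, 0.0, 0.0])   # constant polynomial 1
trunc_coef = np.array([2.0, -1.0])
y = truncated_power_activation(np.array([-1.0, 0.5, 2.0]), knots, poly_coef, trunc_coef)
```

In this sketch, each truncated power function contributes only to the right of its knot, which is what makes the role of each knot easy to read off. One plausible reading of shared versus individual knots is that the same `knots` array is reused across edges in the shared case, while each output receives its own array otherwise; that mapping is an assumption here.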
