ReLU-KAN: New Kolmogorov-Arnold Networks that Only Need Matrix Addition, Dot Multiplication, and ReLU

Qi Qiu, Tao Zhu, Helin Gong, Liming Chen, Huansheng Ning

TL;DR

This work tackles the computational bottleneck of Kolmogorov-Arnold Networks caused by the B-spline basis by introducing ReLU-KAN, a GPU-friendly architecture that replaces B-splines with a trainable, ReLU-based basis built from matrix operations. The method maintains the core KAN properties, including resistance to catastrophic forgetting, while achieving substantial speedups (up to 20x) and improved fitting accuracy (1–3 orders of magnitude) in larger networks. The approach reformulates the basis computation into efficient linear algebra operations and convolution, enabling seamless integration with PyTorch for both training and inference. Overall, ReLU-KAN demonstrates that simpler, trainable basis functions can offer significant practical gains in both performance and scalability for KANs, with potential for further exploration of alternative bases.

Abstract

Limited by the complexity of basis function (B-spline) calculations, Kolmogorov-Arnold Networks (KAN) suffer from restricted parallel computing capability on GPUs. This paper proposes a novel ReLU-KAN implementation that inherits the core idea of KAN. By adopting ReLU (Rectified Linear Unit) and point-wise multiplication, we simplify the design of KAN's basis function and optimize the computation process for efficient CUDA computing. The proposed ReLU-KAN architecture can be readily implemented on existing deep learning frameworks (e.g., PyTorch) for both inference and training. Experimental results demonstrate that ReLU-KAN achieves a 20x speedup compared to traditional KAN with 4-layer networks. Furthermore, ReLU-KAN exhibits a more stable training process with superior fitting ability while preserving the "catastrophic forgetting avoidance" property of KAN. The code is available at https://github.com/quiqi/relu_kan
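
To make the idea of a basis built only from ReLU and point-wise multiplication concrete, here is a minimal sketch in plain Python. It assumes each basis function is a squared product of two ReLUs on an interval [s, e], normalized so that its peak equals 1; the function names (`relu_basis`, `make_intervals`, `phi`) and the interval layout are illustrative assumptions, not taken from the text above or from the linked repository.

```python
from typing import List, Tuple


def relu(x: float) -> float:
    """Rectified linear unit for a scalar."""
    return x if x > 0.0 else 0.0


def relu_basis(x: float, s: float, e: float) -> float:
    """Bell-shaped bump that is zero outside [s, e] and peaks at 1 at the midpoint.

    Assumed form: [ReLU(e - x) * ReLU(x - s)]^2 * 16 / (e - s)^4.
    """
    norm = 16.0 / (e - s) ** 4
    return (relu(e - x) * relu(x - s)) ** 2 * norm


def make_intervals(G: int, k: int) -> List[Tuple[float, float]]:
    """Hypothetical layout: G + k overlapping intervals, each (k + 1) / G wide,
    shifted by 1 / G, so that together they cover [0, 1]."""
    return [((i - k) / G, (i + 1) / G) for i in range(G + k)]


def phi(x: float, weights: List[float], intervals: List[Tuple[float, float]]) -> float:
    """One learnable univariate function: a weighted sum of the basis bumps."""
    return sum(w * relu_basis(x, s, e) for w, (s, e) in zip(weights, intervals))


if __name__ == "__main__":
    intervals = make_intervals(G=5, k=3)
    weights = [1.0] * len(intervals)  # placeholder weights; trained in practice
    print(len(intervals), phi(0.5, weights, intervals))
```

In a framework such as PyTorch the same arithmetic would be applied to whole tensors at once, which is where the GPU-friendliness described in the abstract comes from; the scalar version here only illustrates the shape of the computation.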

Paper Structure

This paper contains 13 sections, 10 equations, 7 figures, and 7 tables.

Figures (7)

  • Figure 1: The Kolmogorov-Arnold representation theorem can be presented as a two-layer structure.
  • Figure 2: The $i^{th}$ B-spline.
  • Figure 3: Appearance of $\boldsymbol{B}$ for the case of $G = 5$ and $k = 3$.
  • Figure 4: The construction of $R_i$
  • Figure 5: Appearance of $\boldsymbol{R}$ for the case of $G = 5$ and $k = 3$.
  • ...and 2 more figures