Hardware-Oriented Inference Complexity of Kolmogorov-Arnold Networks

Bilal Khalid, Pedro Freire, Sergei K. Turitsyn, Jaroslaw E. Prilepsky

Abstract

Kolmogorov-Arnold Networks (KANs) have recently emerged as a powerful architecture for various machine learning applications. However, their unique structure raises significant concerns regarding their computational overhead. Existing studies primarily evaluate KAN complexity in terms of the Floating-Point Operations (FLOPs) required for GPU-based training and inference. Yet in many latency-sensitive and power-constrained deployment scenarios, such as neural-network-driven non-linearity mitigation in optical communications or channel state estimation in wireless communications, training is performed offline and dedicated hardware accelerators are preferred over GPUs for inference. Recent hardware implementation studies report KAN complexity using platform-specific resource consumption metrics such as Look-Up Tables, Flip-Flops, and Block RAMs, but these metrics require a full hardware design and synthesis stage, which limits their utility for early-stage architectural decisions and cross-platform comparisons. To address this, we derive generalized, platform-independent formulae for evaluating the hardware inference complexity of KANs in terms of Real Multiplications (RM), Bit Operations (BOP), and Number of Additions and Bit-Shifts (NABS). We extend our analysis across multiple KAN variants, including B-spline, Gaussian Radial Basis Function (GRBF), Chebyshev, and Fourier KANs. The proposed metrics can be computed directly from the network structure and enable a fair and straightforward inference complexity comparison between KANs and other neural network architectures.
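To illustrate the final claim above, that such metrics can be computed directly from the network structure, the following minimal sketch counts Real Multiplications (RM) from the layer widths alone. It assumes one multiplication per weight for an MLP edge and a hypothetical constant per-edge cost for a KAN edge activation; the actual per-edge costs are basis-dependent and given by the formulae derived in the paper, not by the placeholder value used here.

```python
# Minimal sketch (not the paper's exact formulae): counting real multiplications (RM)
# directly from the layer widths of a network.

def mlp_rm(widths):
    """RM of a fully connected MLP: a layer with n_in inputs and n_out outputs
    performs n_in * n_out weight multiplications."""
    return sum(n_in * n_out for n_in, n_out in zip(widths[:-1], widths[1:]))

def kan_rm(widths, per_edge_rm):
    """RM of a KAN with the same node structure, assuming every learnable edge
    activation costs `per_edge_rm` real multiplications (a stand-in for the
    basis-dependent per-edge costs derived in the paper)."""
    return sum(n_in * n_out * per_edge_rm for n_in, n_out in zip(widths[:-1], widths[1:]))

if __name__ == "__main__":
    arch = [3, 16, 16, 2]  # architecture used in Figure 3
    print("MLP RM:", mlp_rm(arch))                        # 3*16 + 16*16 + 16*2 = 336
    print("KAN RM (placeholder per-edge cost of 10):", kan_rm(arch, per_edge_rm=10))
```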

Paper Structure

This paper contains 25 sections, 47 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: A KAN layer with two input and three output nodes. Unlike MLPs where activations reside at nodes, each KAN edge contains a learnable activation function.
  • Figure 2: Basis functions commonly used in KANs: (a) B-splines; (b) Gaussian radial basis functions (GRBF); (c) Chebyshev polynomials; (d) Fourier basis.
  • Figure 3: Comparison of hardware inference complexity for MLP and KAN variants using architecture $[3, 16, 16, 2]$. All KAN variants use representative parameters: B-spline ($k=3, G=5$), GRBF ($N_c=5$), Chebyshev ($n=5$), and Fourier ($G=5$). Results show (a) Real Multiplications (RM), (b) Bit Operations (BOP), and (c) Number of Additions and Bit Shifts (NABS).
  • Figure 4: Complexity scaling with network width for architecture $[3, X, X, 2]$ where $X$ varies from 4 to 64. All networks exhibit quadratic scaling dominated by the $X \times X$ hidden layer, and the computational overhead ratio between each KAN variant and the MLP remains constant across network sizes (a back-of-the-envelope check of this scaling follows the list).
  • Figure 5: Iso-complexity analysis showing the required hidden layer width $X$ for KAN architecture $[3, X, X, 2]$ to match the computational cost of the MLP baseline $[3, 64, 64, 2]$ across the three metrics: (a) Real Multiplications (RM), (b) Bit Operations (BOP), and (c) Number of Additions and Bit Shifts (NABS).
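As a back-of-the-envelope check of the scaling behaviour described for Figure 4, assume each KAN edge costs some constant $c$ real multiplications, where $c$ depends on the chosen basis and its hyperparameters (e.g. spline order and grid size) and is given by the paper's formulae; this is an illustrative assumption, not a result reproduced from the paper. For the architecture $[3, X, X, 2]$, counting one multiplication per MLP weight gives

$$\mathrm{RM}_{\mathrm{MLP}}(X) = 3X + X^2 + 2X = X^2 + 5X, \qquad \mathrm{RM}_{\mathrm{KAN}}(X) \approx c\,(X^2 + 5X),$$

so both counts grow quadratically in $X$, dominated by the $X \times X$ hidden layer, and their ratio stays approximately equal to $c$ regardless of network size, consistent with the constant overhead ratio reported in Figure 4.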