
SineKAN: Kolmogorov-Arnold Networks Using Sinusoidal Activation Functions

Eric A. F. Reinhardt, P. R. Dinesh, Sergei Gleyzer

TL;DR

SineKAN introduces a sinusoidal-activation-based Kolmogorov-Arnold Network (KAN) that replaces B-Spline edges with grids of learnable sine functions. The layer output follows $y_i = \sum_{j,k} A_{ijk} \sin(\omega_k x_j + \phi_{jk}) + b_i$, with frequencies $\omega_k$, phases $\phi_{jk}$, and amplitudes $A_{ijk}$ learned on a grid, alongside a phase-scaling initialization to stabilize deep models. Empirically, SineKAN achieves higher MNIST accuracy and substantially faster inference than B-SplineKAN and FourierKAN, while maintaining competitive performance with MLPs under optimized conditions; it also demonstrates partial resistance to catastrophic forgetting in continual learning and favorable generalization of periodic patterns. The work discusses practical scaling, limitations (e.g., grid expandability, symbolic expressions), and potential for integration into larger architectures, highlighting SineKAN as a fast, scalable alternative within the KAN framework.
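The layer formula above can be sketched directly in NumPy. This is a minimal illustration rather than the authors' implementation: the function name `sinekan_layer`, the tensor shapes, and the parameter values below are assumptions chosen for the example, and the random initialization is a placeholder, not the phase-scaling initialization described in the paper.

```python
import numpy as np

def sinekan_layer(x, amplitudes, freqs, phases, bias):
    """Evaluate y_i = sum_{j,k} A_ijk * sin(omega_k * x_j + phi_jk) + b_i.

    Assumed shapes (hypothetical, for illustration only):
      x:          (batch, n_in)
      amplitudes: (n_out, n_in, g)
      freqs:      (g,)
      phases:     (n_in, g)
      bias:       (n_out,)
    """
    # Sine features for every (input j, grid index k) pair: (batch, n_in, g)
    s = np.sin(x[:, :, None] * freqs[None, None, :] + phases[None, :, :])
    # Weighted sum over inputs j and grid indices k, then add the bias
    return np.einsum("bjk,ijk->bi", s, amplitudes) + bias

rng = np.random.default_rng(0)
batch, n_in, n_out, g = 4, 3, 2, 5

x = rng.normal(size=(batch, n_in))
A = rng.normal(size=(n_out, n_in, g))        # placeholder amplitudes
omega = np.arange(1, g + 1, dtype=float)     # placeholder frequencies
phi = rng.uniform(0.0, np.pi, size=(n_in, g))  # placeholder phases
b = np.zeros(n_out)

y = sinekan_layer(x, A, omega, phi, b)
print(y.shape)
```

Because the sine features are computed once and then contracted with the amplitude tensor, the whole layer reduces to one elementwise `sin` followed by a single tensor contraction.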

Abstract

Recent work has established an alternative to traditional multi-layer perceptron neural networks in the form of Kolmogorov-Arnold Networks (KAN). The general KAN framework uses learnable activation functions on the edges of the computational graph followed by summation on nodes. The learnable edge activation functions in the original implementation are basis spline functions (B-Spline). Here, we present a model in which learnable grids of B-Spline activation functions are replaced by grids of re-weighted sine functions (SineKAN). We evaluate the numerical performance of our model on a benchmark vision task. We show that our model can perform better than or comparably to B-Spline KAN models and an alternative KAN implementation based on periodic cosine and sine functions representing a Fourier series. Further, we show that SineKAN has numerical accuracy that could scale comparably to dense neural networks (DNNs). Compared to the two baseline KAN models, SineKAN achieves a substantial speed increase at all hidden layer sizes, batch sizes, and depths. The current advantage of DNNs due to hardware and software optimizations is discussed along with theoretical scaling. Properties of SineKAN compared to other KAN implementations, as well as its current limitations, are also discussed.

Paper Structure

This paper contains 18 sections, 29 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Flow of operations. Top: MLP. Bottom: KAN.
  • Figure 2: Value of $\sum_{i=1}^g \sin \left(x + \frac{i}{g+1} \right)$ as a function of $x$, with the ratio of the sum at $g+1$ to the sum at $g$ as the color scale. Left to right: $g=2$, $g=10$, $g=20$.
  • Figure 3: Value of $\sum_{k=1}^g \sin\left(x + \frac{k\pi}{g+1} R(g)\right)$ as a function of $x$, with the ratio of the sum at $g+1$ to the sum at $g$ as the color scale. Left to right: $g=2$, $g=10$, $g=20$.
  • Figure 4: (a) Outputs of layers of the same size ($N=1000$) with the recursive function applied for grid size scaling. (b) Outputs of layers of the same size ($N=1000$) without the recursive function applied for grid size scaling.
  • Figure 5: (a) Outputs of consecutive layers of different sizes in a SineKAN model. (b) Outputs of consecutive layers of the same size in a SineKAN model. (c) Outputs of consecutive layers of the same size in a B-SplineKAN model.
  • ...and 6 more figures
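
The quantity in the Figure 2 caption, $\sum_{i=1}^g \sin\left(x + \frac{i}{g+1}\right)$, can be evaluated numerically to see how it changes with the grid size $g$. The sketch below is only an illustration of that sum: the helper name `phase_shifted_sine_sum` and the sampling range for $x$ are assumptions, and it does not reproduce the figure's color scale or the $R(g)$ rescaling from Figure 3, which is not defined in this excerpt.

```python
import numpy as np

def phase_shifted_sine_sum(x, g):
    """Return sum_{i=1}^{g} sin(x + i / (g + 1)) for each entry of x."""
    i = np.arange(1, g + 1)
    # Broadcast to (len(x), g), then sum over the grid index i
    return np.sin(x[:, None] + i[None, :] / (g + 1)).sum(axis=1)

# Assumed sampling range for x (one full period)
x = np.linspace(-np.pi, np.pi, 201)

peaks = {}
for g in (2, 10, 20):
    peaks[g] = np.abs(phase_shifted_sine_sum(x, g)).max()
    print(f"g={g:2d}  peak |sum| on the sampled range = {peaks[g]:.3f}")
```

With these unscaled phases the peak magnitude of the sum grows as $g$ increases, so the output scale of the sum depends on the chosen grid size.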