SineKAN: Kolmogorov-Arnold Networks Using Sinusoidal Activation Functions
Eric A. F. Reinhardt, P. R. Dinesh, Sergei Gleyzer
TL;DR
SineKAN introduces a sinusoidal-activation-based Kolmogorov-Arnold Network (KAN) that replaces B-Spline edges with grids of learnable sine functions. The layer output follows $y_i = \sum_{j,k} A_{ijk} \sin(\omega_k x_j + \phi_{jk}) + b_i$, with frequencies $\omega_k$, phases $\phi_{jk}$, and amplitudes $A_{ijk}$ learned on a grid, alongside a phase-scaling initialization to stabilize deep models. Empirically, SineKAN achieves higher MNIST accuracy and substantially faster inference than B-SplineKAN and FourierKAN, while maintaining competitive performance with MLPs under optimized conditions; it also demonstrates partial resistance to catastrophic forgetting in continual learning and favorable generalization of periodic patterns. The work discusses practical scaling, limitations (e.g., grid expandability, symbolic expressions), and potential for integration into larger architectures, highlighting SineKAN as a fast, scalable alternative within the KAN framework.
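The layer formula above can be read as a direct computation. The following is a minimal, dependency-free sketch of a single forward pass for one input vector. It is not the authors' implementation: the function name `sine_kan_layer`, the nested-list layout, and the initial values (integer frequencies, uniform random phases, small Gaussian amplitudes) are assumptions made for illustration. In particular, the paper's phase-scaling initialization is not reproduced here.

```python
import math
import random


def sine_kan_layer(x, A, omega, phi, b):
    """Evaluate y_i = sum_{j,k} A[i][j][k] * sin(omega[k] * x[j] + phi[j][k]) + b[i].

    x:     input vector, length in_dim
    A:     amplitudes, shape (out_dim, in_dim, grid)
    omega: frequencies, length grid
    phi:   phases, shape (in_dim, grid)
    b:     biases, length out_dim
    """
    in_dim, grid = len(x), len(omega)
    # The sine features depend only on (j, k), so compute them once
    # and reuse them for every output index i.
    s = [[math.sin(omega[k] * x[j] + phi[j][k]) for k in range(grid)]
         for j in range(in_dim)]
    return [
        b[i] + sum(A[i][j][k] * s[j][k]
                   for j in range(in_dim) for k in range(grid))
        for i in range(len(b))
    ]


# Illustrative parameter values only (assumed, not taken from the paper).
random.seed(0)
in_dim, out_dim, grid = 3, 2, 4
omega = [float(k + 1) for k in range(grid)]
phi = [[random.uniform(0.0, math.pi) for _ in range(grid)]
       for _ in range(in_dim)]
A = [[[random.gauss(0.0, 0.1) for _ in range(grid)]
      for _ in range(in_dim)]
     for _ in range(out_dim)]
b = [0.0] * out_dim

y = sine_kan_layer([0.5, -1.0, 2.0], A, omega, phi, b)  # list of out_dim floats
```

In a training setting, `A`, `omega`, `phi`, and `b` would be learnable tensors and the double sum would typically be expressed as a batched tensor contraction; the loop form is used here only to mirror the indices in the formula.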
Abstract
Recent work has established an alternative to traditional multi-layer perceptron neural networks in the form of Kolmogorov-Arnold Networks (KAN). The general KAN framework uses learnable activation functions on the edges of the computational graph, followed by summation on the nodes. In the original implementation, the learnable edge activation functions are basis spline (B-Spline) functions. Here, we present a model in which the learnable grids of B-Spline activation functions are replaced by grids of re-weighted sine functions (SineKAN). We evaluate the numerical performance of our model on a benchmark vision task. We show that our model performs better than or comparably to B-Spline KAN models and to an alternative KAN implementation based on periodic cosine and sine functions representing a Fourier series. Further, we show that SineKAN has numerical accuracy that could scale comparably to dense neural networks (DNNs). Compared to the two baseline KAN models, SineKAN achieves a substantial speed increase at all hidden layer sizes, batch sizes, and depths. We discuss the current advantages of DNNs that stem from hardware and software optimizations, along with theoretical scaling. We also discuss the properties of SineKAN relative to other KAN implementations and its current limitations.
