PRKAN: Parameter-Reduced Kolmogorov-Arnold Networks
Hoang-Thang Ta, Duy-Quy Thai, Anh Tran, Grigori Sidorov, Alexander Gelbukh
TL;DR
The paper tackles the high parameter cost of Kolmogorov-Arnold Networks (KANs) and introduces PRKAN, a Parameter-Reduced KAN whose layers have parameter counts comparable to those of MLP layers. PRKAN combines attention mechanisms, convolutional components, dimension summation, and feature-weight vectors with data normalization to compress KAN layers without altering the overall network structure. On MNIST and Fashion-MNIST, PRKAN variants, especially those with attention and layer normalization, achieve validation accuracy competitive with MLPs at similar parameter budgets, and Gaussian Radial Basis Functions (GRBFs) often give faster, more accurate results than B-splines. The work shows that KANs can be made parameter-efficient and competitive, offering a pathway to lightweight KANs for image tasks and beyond.
Abstract
Kolmogorov-Arnold Networks (KANs) represent an innovation in neural network architectures, offering a compelling alternative to Multi-Layer Perceptrons (MLPs) in models such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformers. They have prompted new research and applications across scientific domains that use neural networks. However, existing KANs often require significantly more parameters in their network layers than MLPs. To address this limitation, this paper introduces PRKANs (Parameter-Reduced Kolmogorov-Arnold Networks), which employ several methods to reduce the parameter count in KAN layers, making it comparable to that of MLP layers. Experimental results on the MNIST and Fashion-MNIST datasets demonstrate that PRKANs outperform several existing KANs, and that their variant with attention mechanisms rivals the performance of MLPs, albeit with slightly longer training times. Furthermore, the study highlights the advantages of Gaussian Radial Basis Functions (GRBFs) and layer normalization in KAN designs. The repository for this work is available at: https://github.com/hoangthangta/All-KAN.
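To make the parameter gap concrete, the following rough sketch compares a dense MLP layer with a spline-based KAN layer. It assumes the commonly cited count of one learnable univariate function per edge, each carrying grid_size + spline_order basis coefficients. Real KAN implementations may add base weights or scaling terms, so the figures are approximate. The function names, the hidden width of 64, and the defaults grid_size=5 and spline_order=3 are illustrative assumptions, not values taken from the paper, and the sketch does not implement any of the PRKAN reduction methods.

```python
def mlp_layer_params(d_in: int, d_out: int, bias: bool = True) -> int:
    """Parameter count of a dense layer: a weight matrix plus an optional bias."""
    return d_in * d_out + (d_out if bias else 0)


def kan_layer_params(d_in: int, d_out: int,
                     grid_size: int = 5, spline_order: int = 3) -> int:
    """Approximate count for a spline-based KAN layer (assumption: spline
    coefficients only, i.e. grid_size + spline_order coefficients per edge)."""
    return d_in * d_out * (grid_size + spline_order)


# Illustrative layer: 784 inputs (a flattened 28x28 image), 64 hidden units.
d_in, d_out = 784, 64
mlp = mlp_layer_params(d_in, d_out)
kan = kan_layer_params(d_in, d_out)
print(f"MLP layer parameters: {mlp}")
print(f"KAN layer parameters: {kan}")
print(f"KAN / MLP ratio:      {kan / mlp:.1f}")
```

Under these assumptions the KAN layer grows with the number of basis functions per edge, which is the overhead that a parameter-reduced design has to remove to reach an MLP-sized budget.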
