Rethinking the Function of Neurons in KANs

Mohammed Ghaith Altarabichi

TL;DR

This work questions the default use of a sum as the neuron-level operation in Kolmogorov-Arnold Networks (KANs) and investigates whether a different multivariate function can improve practical utility in high-dimensional settings. The authors systematically compare nine multivariate neuron functions on ten classification datasets using a two-layer KAN and identify the mean as yielding the strongest and most consistent performance. They relate its benefits to keeping activations within the effective range of the spline activations and to the Kolmogorov-Arnold representation theorem, which expresses any continuous multivariate function $f(x_1, \ldots, x_n)$ as $f = \sum_{q=1}^{2n+1} \Phi_q \left( \sum_{p=1}^n \phi_{q,p}(x_p) \right)$. They demonstrate that mean-based KANs improve both accuracy and training stability relative to a standard sum-based KAN, and even outperform KAN variants with Layer Normalization on several datasets. The study provides practical design guidance for KANs, showing that a simple averaging operation is consistent with the theoretical underpinnings and yields tangible gains on tabular data, with potential extensions to other KAN architectures.
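To make the range argument concrete, the two neuron functions can be written side by side. The layer notation below ($x_{l,i}$ for the $i$-th input of layer $l$, $n_l$ for the layer width, $\phi_{l,j,i}$ for the edge activation, and the bound $B$) is an assumption borrowed from common KAN notation rather than taken from this summary, so treat it as a sketch of the idea, not the paper's exact formulation.

```latex
% Sum neuron (standard KAN) versus mean neuron, for output j of layer l+1:
x^{\mathrm{sum}}_{l+1,j}  = \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}),
\qquad
x^{\mathrm{mean}}_{l+1,j} = \frac{1}{n_l} \sum_{i=1}^{n_l} \phi_{l,j,i}(x_{l,i}).

% If every edge activation is bounded, |\phi_{l,j,i}(\cdot)| \le B, then
|x^{\mathrm{sum}}_{l+1,j}| \le n_l B,
\qquad
|x^{\mathrm{mean}}_{l+1,j}| \le B.
```

Under this assumption the bound on the sum grows with the number of inputs $n_l$, while the bound on the mean does not. Since the mean is the sum scaled by the constant $1/n_l$, that factor can in principle be absorbed into the next layer's univariate functions, which is one way to see why averaging stays compatible with the representation theorem.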

Abstract

The neurons of Kolmogorov-Arnold Networks (KANs) perform a simple summation motivated by the Kolmogorov-Arnold representation theorem, which asserts that the sum is the only fundamental multivariate function. In this work, we investigate whether an alternative multivariate function for KAN neurons may offer greater practical utility. Our empirical study tests various multivariate functions in KAN neurons across a range of benchmark Machine Learning tasks. Our findings indicate that replacing the sum with the average function in KAN neurons yields significant performance improvements over traditional KANs. We show that this minor modification stabilizes training by confining the input to the spline within the effective range of the activation function. Our implementation and experiments are available at: https://github.com/Ghaith81/dropkan
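The following is a minimal NumPy sketch of the range effect described above; it is not the authors' implementation (which is in the linked repository). It assumes a bounded stand-in edge activation, `tanh(w * x)`, in place of learnable splines, and the names `neuron_outputs` and `fraction_in_range` are hypothetical helpers introduced only for illustration. It measures what fraction of neuron outputs fall inside an assumed grid range of [-1, 1] under sum and mean aggregation.

```python
import numpy as np


def neuron_outputs(x, weights, reduce):
    """Aggregate per-edge activations into one value per neuron.

    x:       (batch, n_in) inputs
    weights: (n_out, n_in) per-edge scale factors
    reduce:  np.sum or np.mean, applied over the input axis

    Assumption: tanh(w * x) stands in for a learnable spline on each edge,
    so every edge activation lies in [-1, 1].
    """
    edge = np.tanh(x[:, None, :] * weights[None, :, :])  # (batch, n_out, n_in)
    return reduce(edge, axis=-1)                         # (batch, n_out)


def fraction_in_range(values, low=-1.0, high=1.0):
    """Fraction of entries that fall inside the closed interval [low, high]."""
    return float(np.mean((values >= low) & (values <= high)))


rng = np.random.default_rng(0)
n_in, n_out, batch = 64, 16, 512
x = rng.normal(size=(batch, n_in))
w = rng.normal(size=(n_out, n_in))

sum_out = neuron_outputs(x, w, np.sum)
mean_out = neuron_outputs(x, w, np.mean)

frac_sum = fraction_in_range(sum_out)
frac_mean = fraction_in_range(mean_out)
print(f"sum neuron:  {frac_sum:.2%} of outputs in [-1, 1]")
print(f"mean neuron: {frac_mean:.2%} of outputs in [-1, 1]")
```

Because each stand-in activation is bounded by 1 in magnitude, the mean is bounded by 1 as well, whereas the sum can reach magnitudes up to `n_in`. With real spline activations the picture depends on the learned coefficients, so this only illustrates the scaling behaviour.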

Paper Structure (10 sections, 3 equations, 1 figure, 4 tables)

This paper contains 10 sections, 3 equations, 1 figure, 4 tables.

Introduction
Background
Kolmogorov-Arnold representation theorem
Kolmogorov-Arnold Networks
Empirical Study and Discussion
Experimental Setup
Experiment I - Evaluation of Different Neuron Functions
Experiment II - Mean vs Sum as a Neuron Function
Related Work
Conclusions

Figures (1)

Figure 1: The percentage of KAN intermediate-layer neuron outputs that fall within the effective grid range of [-1.0, +1.0]. Datasets are ordered from left to right by ascending number of features.
