Free-Knots Kolmogorov-Arnold Network: On the Analysis of Spline Knots and Advancing Stability
Liangwei Nathan Zheng, Wei Emma Zhang, Lin Yue, Miao Xu, Olaf Maennel, Weitong Chen
TL;DR
This work tackles fixed-knot limitations, excessive parameter counts, and training instability in Kolmogorov-Arnold Networks (KANs) by deriving knot-count bounds and proposing Free-Knots KAN (FR-KAN). FR-KAN combines neuron grouping with weight sharing, free grid shifts, and a $C^2$-continuity training strategy to reduce parameters to the scale of standard MLPs while enabling more flexible activations. The authors validate FR-KAN across image, text, time-series, multimodal, and function-approximation tasks, showing competitive or superior performance and enhanced stability over vanilla KAN and MLP baselines, with interpretable learned activations and a wider activation field from larger grids. Overall, the approach provides practical guidance for scalable KAN deployment and opens avenues for further efficiency gains in spline-based neural architectures.
Abstract
Kolmogorov-Arnold Neural Networks (KANs) have gained significant attention in the machine learning community. However, their implementations often suffer from poor training stability and a large number of trainable parameters. Furthermore, there is limited understanding of the behavior of the learned activation functions derived from B-splines. In this work, we analyze the behavior of KANs through the lens of spline knots and derive lower and upper bounds on the number of knots in B-spline-based KANs. To address existing limitations, we propose a novel Free-Knots KAN (FR-KAN) that enhances the performance of the original KAN while reducing the number of trainable parameters to the scale of standard Multi-Layer Perceptrons (MLPs). Additionally, we introduce a new training strategy that ensures $C^2$ continuity of the learnable spline, yielding smoother activations than the original KAN, and that improves training stability through range expansion. The proposed method is comprehensively evaluated on 8 datasets spanning various domains, including image, text, time-series, multimodal, and function-approximation tasks. The promising results demonstrate the feasibility of KAN-based networks and the effectiveness of the proposed method.
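To make the notion of free knots concrete, the following is a minimal sketch of evaluating a cubic B-spline basis on a non-uniform, learnable knot vector. It is not the authors' implementation: the helper names `free_knots` and `bspline_basis`, and the softmax-based parameterization that keeps the knots sorted, are assumptions chosen for illustration only. The basis itself uses the standard Cox-de Boor recursion; a cubic B-spline with distinct knots is $C^2$ continuous.

```python
import numpy as np

def free_knots(raw, lo, hi):
    """Map unconstrained parameters to strictly increasing knots on [lo, hi].

    Hypothetical parameterization (an assumption, not taken from the paper):
    a softmax turns `raw` into positive interval widths that sum to 1, and
    their cumulative sum gives sorted knot positions.
    """
    w = np.exp(raw - raw.max())
    w /= w.sum()
    return lo + (hi - lo) * np.concatenate([[0.0], np.cumsum(w)])

def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x via Cox-de Boor recursion.

    Returns an array of shape (len(x), len(knots) - degree - 1).
    `knots` must be non-decreasing; it need not be uniformly spaced.
    """
    x = np.asarray(x, dtype=float)[:, None]
    t = np.asarray(knots, dtype=float)
    # Degree 0: indicator of each half-open knot interval [t_i, t_{i+1}).
    B = ((x >= t[None, :-1]) & (x < t[None, 1:])).astype(float)
    for k in range(1, degree + 1):
        n = len(t) - k - 1
        left_den = t[k:k + n] - t[:n]
        right_den = t[k + 1:k + 1 + n] - t[1:n + 1]
        # Terms with a zero denominator (repeated knots) are defined as 0.
        left = np.where(left_den > 0,
                        (x - t[None, :n]) / np.where(left_den > 0, left_den, 1.0),
                        0.0)
        right = np.where(right_den > 0,
                         (t[None, k + 1:k + 1 + n] - x) / np.where(right_den > 0, right_den, 1.0),
                         0.0)
        B = left * B[:, :n] + right * B[:, 1:n + 1]
    return B

# Example: 11 free knots on [-1, 1] and a cubic (degree-3) basis.
rng = np.random.default_rng(0)
knots = free_knots(rng.normal(size=10), -1.0, 1.0)
degree = 3
n_basis = len(knots) - degree - 1
# The basis forms a partition of unity on [t_degree, t_{n_basis}).
xs = np.linspace(knots[degree], knots[n_basis], 50, endpoint=False)
basis = bspline_basis(xs, knots, degree)
```

A learned activation would then be a weighted sum `basis @ coeffs` for a coefficient vector of length `n_basis`; with free knots, both the coefficients and the `raw` knot parameters could be trained.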
