Convolutional Kolmogorov-Arnold Networks
Alexander Dylan Bodner, Antonio Santiago Tepsich, Jack Natan Spolski, Santiago Pourteau
TL;DR
The paper tackles parameter efficiency in vision models by introducing Convolutional Kolmogorov-Arnold Networks, which replace the scalar weights of convolutional kernels with learnable univariate functions parameterized by B-splines. The authors define KAN Convolutions, discuss grid extension to keep the splines well-behaved, and analyze how the design affects parameter counts. On Fashion-MNIST, several configurations reach competitive accuracy with substantially fewer parameters (up to 50% fewer than baseline classic convolutions). The experiments compare multiple architectures and highlight both potential gains in expressivity and practical challenges, notably slower training because the spline computations cannot be readily parallelized on GPUs. The work positions Convolutional KANs as a promising, parameter-efficient alternative to standard CNNs and outlines directions for improving scalability and interpretability in future research.
Abstract
In this paper, we present Convolutional Kolmogorov-Arnold Networks, a novel architecture that integrates the learnable spline-based activation functions of Kolmogorov-Arnold Networks (KANs) into convolutional layers. By replacing traditional fixed-weight kernels with learnable non-linear functions, Convolutional KANs offer a significant improvement in parameter efficiency and expressive power over standard Convolutional Neural Networks (CNNs). We empirically evaluate Convolutional KANs on the Fashion-MNIST dataset, demonstrating competitive accuracy with up to 50% fewer parameters compared to baseline classic convolutions. This suggests that the KAN Convolution can effectively capture complex spatial relationships with fewer resources, offering a promising alternative for parameter-efficient deep learning models.
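To make the idea of "learnable non-linear functions in place of kernel weights" concrete, the following is a minimal NumPy sketch of a single-channel KAN-style convolution. It is an illustration under stated assumptions, not the authors' implementation: the function names (`make_knots`, `bspline_basis`, `silu`, `kan_conv2d`), the uniform grid on [-1, 1), the SiLU base term, and the per-element parameterization (spline coefficients plus two scalar weights) are all choices made here for the example.

```python
import numpy as np

def make_knots(grid_size, degree, lo=-1.0, hi=1.0):
    """Uniform knot vector on [lo, hi], extended by `degree` knots per side (assumed layout)."""
    h = (hi - lo) / grid_size
    return lo + h * np.arange(-degree, grid_size + degree + 1)

def bspline_basis(x, knots, degree):
    """Cox-de Boor recursion. Returns an array of shape x.shape + (n_basis,)."""
    x = np.asarray(x, dtype=float)[..., None]
    # Degree-0 basis: indicator functions of the half-open knot intervals.
    B = ((x >= knots[:-1]) & (x < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x - knots[:-(d + 1)]) / (knots[d:-1] - knots[:-(d + 1)]) * B[..., :-1]
        right = (knots[d + 1:] - x) / (knots[d + 1:] - knots[1:-d]) * B[..., 1:]
        B = left + right
    return B

def silu(x):
    """Base activation used in this sketch (an assumption, not taken from the text above)."""
    return x / (1.0 + np.exp(-x))

def kan_conv2d(image, coef, w_base, w_spline, knots, degree):
    """Valid (no padding, stride 1) KAN-style convolution on one channel.

    image: (H, W); coef: (K, K, n_basis); w_base, w_spline: (K, K).
    Each kernel position (a, b) applies its own learnable function
    phi_ab(x) = w_base[a, b] * silu(x) + w_spline[a, b] * sum_i coef[a, b, i] * B_i(x),
    and the output pixel is the sum of phi_ab over the patch.
    """
    K = coef.shape[0]
    H, W = image.shape
    out = np.zeros((H - K + 1, W - K + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + K, j:j + K]
            basis = bspline_basis(patch, knots, degree)   # (K, K, n_basis)
            spline = np.sum(coef * basis, axis=-1)        # (K, K)
            phi = w_base * silu(patch) + w_spline * spline
            out[i, j] = phi.sum()
    return out

# Hypothetical configuration: 3x3 kernel, grid size 5, cubic splines.
K, grid_size, degree = 3, 5, 3
knots = make_knots(grid_size, degree)
n_basis = grid_size + degree
rng = np.random.default_rng(0)
image = rng.uniform(-0.9, 0.9, size=(8, 8))
coef = rng.normal(size=(K, K, n_basis))
feature_map = kan_conv2d(image, coef, np.ones((K, K)), np.ones((K, K)), knots, degree)

# Parameter count of this sketch's kernel versus a plain KxK kernel.
kan_params = K * K * (n_basis + 2)
cnn_params = K * K
```

In this sketch each kernel element carries more parameters than a scalar weight, so any overall parameter saving has to come from needing fewer or smaller layers to reach the same accuracy; the count computed here reflects only this example's parameterization and may differ from the paper's accounting. Inputs that fall outside the knot range receive no spline contribution in this sketch, which illustrates why keeping the grid matched to the input range matters.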
