KANs for Computer Vision: An Experimental Study
Karthik Mohan, Hanxiao Wang, Xiatian Zhu
TL;DR
This study experimentally evaluates Kolmogorov-Arnold Networks (KANs) for computer vision, focusing on image classification and comparing against traditional MLP baselines. KANs replace fixed node activations with learnable edge-wise functions, typically B-splines, introducing hyperparameters Grid and Order that govern their flexibility. Results on MNIST, CIFAR-10, and Fashion-MNIST show KANs can match or slightly exceed MLP performance but require more parameters and are more sensitive to hyperparameters, limiting practicality as standalone CV models. Variants such as EfficientKAN and Convolutional KANs mitigate some computational and architectural challenges, and ConvKANs demonstrate notable accuracy gains, suggesting that hybrid or integrated designs hold more promise for scalable vision tasks.
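To make the idea of learnable edge-wise functions concrete, the following is a minimal NumPy sketch of a single KAN-style layer, not the implementation evaluated in this study. It assumes a uniform knot grid on [-1, 1] and reads Grid as the number of grid intervals and Order as the spline degree. The names `make_grid`, `bspline_basis`, and `kan_layer` are hypothetical. The original KAN formulation also adds a residual base activation to each edge, which is omitted here for brevity.

```python
import numpy as np

def make_grid(grid_size, order, lo=-1.0, hi=1.0):
    """Uniform knot vector extended by `order` knots on each side (assumed layout)."""
    h = (hi - lo) / grid_size
    return lo + h * np.arange(-order, grid_size + order + 1)

def bspline_basis(x, knots, order):
    """Evaluate all B-spline basis functions at x via the Cox-de Boor recursion.

    x: shape (n,), knots: shape (m,). Returns shape (n, m - 1 - order).
    """
    x = x[:, None]
    # Degree-0 bases are indicator functions of each knot interval.
    B = ((x >= knots[:-1]) & (x < knots[1:])).astype(float)
    for k in range(1, order + 1):
        left = (x - knots[:-(k + 1)]) / (knots[k:-1] - knots[:-(k + 1)]) * B[:, :-1]
        right = (knots[k + 1:] - x) / (knots[k + 1:] - knots[1:-k]) * B[:, 1:]
        B = left + right
    return B

def kan_layer(X, coef, knots, order):
    """Each edge (i, o) applies its own spline to input i; outputs sum over inputs.

    X: (n, in_features), coef: (in_features, out_features, grid_size + order).
    """
    n, d = X.shape
    B = bspline_basis(X.reshape(-1), knots, order).reshape(n, d, -1)
    return np.einsum("nib,iob->no", B, coef)

# Hypothetical sizes chosen only for illustration.
grid_size, order, in_features, out_features = 5, 3, 4, 2
knots = make_grid(grid_size, order)
rng = np.random.default_rng(0)
X = rng.uniform(-0.99, 0.99, size=(8, in_features))
coef = rng.normal(size=(in_features, out_features, grid_size + order))
Y = kan_layer(X, coef, knots, order)
print(Y.shape)
```

In this sketch each edge carries `grid_size + order` learnable coefficients, which is one way to see why Grid and Order jointly control both the flexibility and the size of a KAN layer.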
Abstract
This paper presents an experimental study of Kolmogorov-Arnold Networks (KANs) applied to computer vision tasks, particularly image classification. KANs place learnable activation functions on edges, offering more flexible non-linear transformations than the fixed activation functions used in conventional architectures such as Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs). While KANs have shown promise mostly on simplified or small-scale datasets, their effectiveness on more complex real-world tasks such as computer vision remains less explored. To fill this gap, this study provides extended observations and insights into the strengths and limitations of KANs. We find that although KANs can perform well on specific vision tasks, they face significant challenges, including increased hyperparameter sensitivity and higher computational costs. These limitations suggest that KANs require architectural adaptations, such as integration with other architectures, to be practical for large-scale vision problems. This study focuses on empirical findings rather than proposing new methods, aiming to inform future research on optimizing KANs, particularly for computer vision and similar applications.
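As a rough illustration of where the extra cost comes from, the sketch below compares parameter counts for one dense MLP layer and one spline-based KAN layer of the same width. It assumes each KAN edge stores Grid + Order spline coefficients and ignores any additional base weights or scaling terms that specific implementations add, so real counts will differ. The function names and the 784-to-10 layer size are hypothetical and are not results from this study.

```python
def mlp_layer_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

def kan_layer_params(in_features, out_features, grid_size, order):
    """Spline coefficients only: one set of (grid_size + order) per edge (assumption)."""
    return in_features * out_features * (grid_size + order)

# Hypothetical layer mapping a flattened 28x28 image to 10 classes.
mlp = mlp_layer_params(784, 10)
kan = kan_layer_params(784, 10, grid_size=5, order=3)
print(mlp, kan)  # 7850 62720
```

Under these assumptions the KAN layer grows linearly in Grid + Order, so finer grids or higher spline orders directly multiply the parameter count of every layer.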
