Kolmogorov-Arnold Network Autoencoders
Mohammadamin Moradi, Shirin Panahi, Erik Bollt, Ying-Cheng Lai
TL;DR
The paper investigates Kolmogorov-Arnold Networks (KANs) as edge-activated alternatives to MLPs for autoencoding tasks. It formalizes KANs using the Kolmogorov-Arnold representation $f(x_1, x_2, \dots, x_n) = \sum_{i=1}^{2n+1} \phi_i \left( \sum_{j=1}^n \psi_{ij}(x_j) \right)$, with spline-based univariate functions $\phi_i$ and $\psi_{ij}$, and details the associated parameter count in terms of $N_a$ and the $(G+K)$ spline coefficients. AE-KAN models are evaluated on MNIST, SVHN, and CIFAR-10, showing competitive reconstruction errors and improved latent-space discrimination when a KNN classifier is applied to the learned representations. The study notes the higher capacity and training cost of KANs, discusses their interpretability, and suggests future work on real-world applications, hybrid architectures, and formal interpretability metrics.
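To make the functional form of the representation concrete, here is a minimal sketch that evaluates a sum of outer univariate functions applied to sums of inner univariate functions. It illustrates the structure only and is not the spline-parameterized AE-KAN from the paper. The helper `ka_eval` and the choice of target function $f(x_1, x_2) = x_1 x_2$ are assumptions made for this example. The product is written as $\frac{(x_1 + x_2)^2}{4} - \frac{(x_1 - x_2)^2}{4}$, so only two outer terms are nonzero; the remaining outer functions are taken to be zero and omitted.

```python
from typing import Callable, Sequence

Univariate = Callable[[float], float]


def ka_eval(
    outer: Sequence[Univariate],
    inner: Sequence[Sequence[Univariate]],
    x: Sequence[float],
) -> float:
    """Evaluate sum_i phi_i( sum_j psi_ij(x_j) ).

    Hypothetical helper for illustration: outer[i] plays the role of phi_i
    and inner[i][j] plays the role of psi_ij.
    """
    total = 0.0
    for phi_i, psi_row in zip(outer, inner):
        # Inner sum: each input coordinate passes through its own 1-D function.
        s = sum(psi_ij(x_j) for psi_ij, x_j in zip(psi_row, x))
        # Outer function: a 1-D function applied to the inner sum.
        total += phi_i(s)
    return total


def identity(t: float) -> float:
    return t


def negate(t: float) -> float:
    return -t


# x1 * x2 = (x1 + x2)^2 / 4 - (x1 - x2)^2 / 4
outer = [lambda u: u * u / 4.0, lambda u: -u * u / 4.0]
inner = [
    [identity, identity],  # first inner sum:  x1 + x2
    [identity, negate],    # second inner sum: x1 - x2
]

print(ka_eval(outer, inner, [3.0, 5.0]))  # 15.0
```

In a KAN, fixed closed-form functions like these are replaced by learnable univariate functions (splines in the paper), but the composition pattern of 1-D functions and sums is the same.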
Abstract
Deep learning models have revolutionized various domains, with Multi-Layer Perceptrons (MLPs) being a cornerstone for tasks like data regression and image classification. However, a recent study has introduced Kolmogorov-Arnold Networks (KANs) as promising alternatives to MLPs, leveraging activation functions placed on edges rather than nodes. This structural shift aligns KANs closely with the Kolmogorov-Arnold representation theorem, potentially enhancing both model accuracy and interpretability. In this study, we explore the efficacy of KANs in the context of data representation via autoencoders, comparing their performance with traditional Convolutional Neural Networks (CNNs) on the MNIST, SVHN, and CIFAR-10 datasets. Our results demonstrate that KAN-based autoencoders achieve competitive performance in terms of reconstruction accuracy, thereby suggesting their viability as effective tools in data analysis tasks.
