Gaussian Process Kolmogorov-Arnold Networks
Andrew Siyuan Chen
TL;DR
GP-KAN addresses the challenge of uncertainty propagation in deep Gaussian process architectures by using GP neurons as the univariate non-linear units of a Kolmogorov-Arnold Network. It propagates Gaussian distributions exactly through depth by taking the inner product of each GP function sample with the input distribution, which gives the output mean and variance in closed form. The approach yields two practical layers, Fully-Connected GP (FCGP) and Convolutional GP (ConvGP), with normalization and activation strategies that preserve Gaussianity. On MNIST, a GP-KAN model with 80 thousand parameters reaches 98.5% accuracy, whereas the state-of-the-art models it is compared against use 1.5 million parameters, suggesting a potentially efficient and uncertainty-aware alternative for image classification and other domains.
Abstract
In this paper, we introduce a probabilistic extension to Kolmogorov-Arnold Networks (KANs) that uses Gaussian Processes (GPs) as non-linear neurons, which we refer to as GP-KAN. A fully analytical approach to handling the output distribution of one GP as the input to another GP is achieved by considering the function inner product of a GP function sample with the input distribution. These GP neurons exhibit robust non-linear modelling capabilities while using few parameters, and they can be easily and fully integrated into a feed-forward network structure. They provide inherent uncertainty estimates for the model prediction and can be trained directly on the log-likelihood objective function, without needing variational lower bounds or approximations. In the context of MNIST classification, a GP-KAN model with 80 thousand parameters achieved 98.5% prediction accuracy, compared to current state-of-the-art models with 1.5 million parameters.
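To make the inner-product idea concrete, the following is a minimal NumPy sketch of a single GP neuron receiving a Gaussian input. It is an illustration under stated assumptions, not the paper's implementation: it assumes an RBF kernel, a GP conditioned noise-free on values h at inducing locations z, and a scalar input x ~ N(mu, s2). The function name gp_neuron_forward and all parameter names are hypothetical. Because the inner product of f with the input density is a linear functional of the GP, its mean and variance follow in closed form from the standard Gaussian integrals of the RBF kernel.

```python
import numpy as np

def rbf(a, b, ell, sf2):
    """RBF kernel matrix between 1-D arrays a and b."""
    d = a[:, None] - b[None, :]
    return sf2 * np.exp(-0.5 * d**2 / ell**2)

def gp_neuron_forward(mu, s2, z, h, ell=1.0, sf2=1.0, jitter=1e-8):
    """Mean and variance of <f, N(mu, s2)> for a GP f conditioned on f(z) = h.

    Assumptions (not taken from the paper): RBF kernel, noise-free
    inducing values h at locations z, scalar Gaussian input.
    """
    K = rbf(z, z, ell, sf2) + jitter * np.eye(len(z))
    # q_i = E_x[k(x, z_i)]: the RBF kernel integrated against N(mu, s2)
    q = sf2 * np.sqrt(ell**2 / (ell**2 + s2)) * np.exp(
        -0.5 * (mu - z) ** 2 / (ell**2 + s2)
    )
    # E_{x,x'}[k(x, x')] for independent x, x' ~ N(mu, s2)
    kpp = sf2 * np.sqrt(ell**2 / (ell**2 + 2.0 * s2))
    mean = q @ np.linalg.solve(K, h)
    var = kpp - q @ np.linalg.solve(K, q)
    return float(mean), float(max(var, 0.0))  # clip tiny negative round-off

# Illustrative values only
z = np.linspace(-2.0, 2.0, 5)
h = np.sin(z)
mean, var = gp_neuron_forward(mu=0.5, s2=0.3, z=z, h=h)
print(mean, var)
```

The returned pair is again a Gaussian summary, so it could be fed to the next neuron in the same way; how the paper combines such outputs across a layer is not shown here.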
