Automatic Grid Updates for Kolmogorov-Arnold Networks using Layer Histograms

Jamison Moody, James Usevitch

TL;DR

The paper addresses the need for automatic domain-grid updates in Kolmogorov-Arnold Networks (KANs) and the challenge of reliable OOD detection. It introduces AdaptKAN, which augments KANs with exponential-moving-average histograms to auto-stretch or shrink the B-spline activation domain based on data distributions, and adds a post-hoc OOD scoring mechanism using layer histograms. The method eliminates manual timing for grid adaptation, refits weights after domain changes, and provides a memory-efficient, architecture-agnostic OOD detector. Across tasks including learning symbolic equations (Feynman dataset), image feature classification, and learning a control Lyapunov function, AdaptKAN matches or exceeds prior KAN and MLP performance while improving robustness to outliers and data poisoning.

Abstract

Kolmogorov-Arnold Networks (KANs) are a class of neural networks that have received increased attention in recent literature. In contrast to MLPs, KANs leverage parameterized, trainable activation functions and offer several benefits, including improved interpretability and higher accuracy on learning symbolic equations. However, the original KAN architecture requires adjustments to the domain discretization of the network (called the "domain grid") during training, creating extra overhead for the user in the training process. Typical KAN layers are not designed to autonomously update their domains in a data-driven manner informed by the changing output ranges of previous layers. To address this, we propose AdaptKAN, a KAN variant that tracks each layer's inputs with histograms and uses them to update the domain grid automatically during training. As an added benefit, this histogram algorithm can also be used to detect out-of-distribution (OOD) inputs in a variety of settings. We demonstrate that AdaptKAN exceeds or matches the performance of prior KAN architectures and MLPs on four different tasks: learning scientific equations from the Feynman dataset, image classification from frozen features, learning a control Lyapunov function, and detecting OOD inputs on the OpenOOD v1.5 benchmark.

Paper Structure

This paper contains 23 sections, 26 equations, 4 figures, 6 tables, 2 algorithms.

Figures (4)

  • Figure 1: A high-level overview of our proposed AdaptKAN architecture. AdaptKAN augments the original KAN architecture with histograms that approximate marginal distributions for each input feature to each layer. The number of bins is equal to the number of grid intervals for each B-spline. AdaptKAN stores two extra bins per histogram (depicted as solid black bins in the figure) that keep track of how much data is falling outside of the current grid domain. If the out-of-domain bin counts rise above a specified threshold, AdaptKAN stretches the domain. On the other hand, if the edge histogram bin counts fall below a specified threshold, AdaptKAN shrinks the domain. Further details on this adaptation algorithm are provided in Section \ref{sec:main_results}.
  • Figure 2: How auto-adapting works: we inspect the out-of-domain bins (high counts stretch the domain grid) and the edge bins (near-empty bins shrink the domain grid). We keep the number of weights and bins fixed during training and instead adjust the domain bounds $a$ and $b$. After stretching or shrinking, we refit the weights exactly using linear least squares. A faster alternative is to refit the weights approximately by interpolating over the control points (weights). We refit the histograms using linear interpolation and rescale the histogram counts so that they total the batch size.
  • Figure 3: Validation accuracy versus parameter count across hyperparameter sweeps for AdaptKAN (one layer), AdaptKAN (2-3 layers), EfficientKAN (one layer), MLP (one layer), and MLP (2-3 layers) on the CIFAR-10 and CIFAR-100 classification tasks. For CIFAR-10, AdaptKAN achieves a higher validation accuracy with fewer parameters than the best-performing MLP (2-3 layers) run. For CIFAR-100, AdaptKAN also achieves the highest accuracy, with fewer parameters than the best-performing EfficientKAN and MLP (2-3 layers) models.
  • Figure 4: Lyapunov function contours and trajectories for the analytical solution (a), AdaptKAN (b), and MLP (c). In each of the plots, we simulate 20 trajectories (starting at a grid of points over the domain) using the controller derived from each CLF. In the background of the plots, we show the contours of the CLF values for each method, with each axis referring to an input feature. In these specific examples, AdaptKAN qualitatively seems to learn a CLF closer to the analytical solution.
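
The histogram bookkeeping described in the captions of Figures 1 and 2 can be illustrated with a minimal NumPy sketch for a single input feature. This is not the authors' implementation: the function names, the EMA rate `alpha`, the use of count fractions for the thresholds, and the choice to reset the out-of-domain bins after a refit are all assumptions made for illustration. The least-squares refit of the B-spline weights is not shown.

```python
import numpy as np

def update_histogram(hist, x, a, b, alpha=0.1):
    """EMA update of a histogram with two extra out-of-domain bins.

    hist has shape (num_bins + 2,): hist[0] counts x < a, hist[-1] counts
    x >= b, and hist[1:-1] covers the domain [a, b). alpha is a hypothetical
    EMA rate, not a value taken from the paper.
    """
    num_bins = hist.shape[0] - 2
    edges = np.linspace(a, b, num_bins + 1)
    # np.digitize returns 0 below the first edge and num_bins + 1 at or
    # above the last edge, which maps directly onto the two extra bins.
    idx = np.digitize(x, edges)
    counts = np.bincount(idx, minlength=num_bins + 2).astype(float)
    return (1.0 - alpha) * hist + alpha * counts

def adapt_decision(hist, stretch_thresh, shrink_thresh):
    """Return 'stretch', 'shrink', or 'keep' (assumed fraction-based rule)."""
    total = hist.sum()
    out_frac = (hist[0] + hist[-1]) / total    # mass outside the domain
    edge_frac = (hist[1] + hist[-2]) / total   # mass in the two edge bins
    if out_frac > stretch_thresh:
        return "stretch"
    if edge_frac < shrink_thresh:
        return "shrink"
    return "keep"

def refit_histogram(hist, a, b, new_a, new_b, batch_size):
    """Resample in-domain counts onto new bounds by linear interpolation."""
    num_bins = hist.shape[0] - 2
    old_width = (b - a) / num_bins
    new_width = (new_b - new_a) / num_bins
    old_centers = a + (np.arange(num_bins) + 0.5) * old_width
    new_centers = new_a + (np.arange(num_bins) + 0.5) * new_width
    # Interpolate densities (counts per unit length), not raw counts,
    # so that changing the bin width does not distort the shape.
    old_density = hist[1:-1] / old_width
    new_density = np.interp(new_centers, old_centers, old_density,
                            left=0.0, right=0.0)
    # Assumption: out-of-domain bins restart at zero after a refit.
    new_hist = np.concatenate([[0.0], new_density * new_width, [0.0]])
    total = new_hist.sum()
    if total > 0:
        new_hist *= batch_size / total  # counts total the batch size
    return new_hist
```

In a training loop, one would call `update_histogram` on each batch, check `adapt_decision`, and after changing the bounds $a$ and $b$ call `refit_histogram` alongside the weight refit. Because each batch's counts total the batch size, the EMA keeps the histogram total at the batch size once it has been initialized that way.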