Want to train KANS at scale? Now UKAN!

Alireza Moradzadeh; Srimukh Prasad Veccham; Lukasz Wawrzyniak; Miles Macklin; Saee G. Paliwal

TL;DR

This paper introduces Unbounded Kolmogorov-Arnold Networks (UKANs), which remove the bounded-grid limitation of Kolmogorov-Arnold Networks (KANs) by using a coefficient-generator (CG) MLP to produce B-spline coefficients on an unbounded grid. The CG MLP takes positional encodings of grid groups as input and returns only the spline coefficients needed locally, which enables function approximation on unbounded domains without data normalization. A GPU-accelerated library, warpKAN, speeds up B-spline evaluation and supports large-scale training. Empirical results across regression, classification, approximation, generation, and drug-discovery tasks show that UKANs match or surpass KAN performance, with 3–30x speedups and up to 1000x memory reduction relative to vanilla KANs. The work demonstrates practical scalability for molecular property prediction, positions UKAN as a building block for large-scale, spline-based neural architectures, and points toward future directions such as multi-GPU training and adaptive knot policies.

Abstract

Kolmogorov-Arnold Networks (KANs) have recently emerged as a powerful alternative to traditional multilayer perceptrons. However, their reliance on predefined, bounded grids restricts their ability to approximate functions on unbounded domains. To address this, we present Unbounded Kolmogorov-Arnold Networks (UKANs), a method that removes the need for bounded grids in traditional KANs. The key innovation of this method is a coefficient-generator (CG) model that produces, on the fly, only the B-spline coefficients required locally on an unbounded symmetric grid. UKANs couple multilayer perceptrons with KANs by feeding the positional encoding of grid groups into the CG model, enabling function approximation on unbounded domains without requiring data normalization. To reduce the computational cost of both UKANs and KANs, we introduce a GPU-accelerated library that lowers B-spline evaluation complexity by a factor proportional to the grid size and enables large-scale learning through efficient memory management, in line with recent software advances such as FlashAttention and FlashFFTConv. Performance benchmarking confirms the superior memory and computational efficiency of our accelerated KAN (warpKAN) and of UKANs, showing a 3-30x speed-up and up to 1000x memory reduction compared to vanilla KANs. Experiments on regression, classification, and generative tasks show that UKANs match or surpass KAN accuracy. Finally, we use both accelerated KAN and UKAN in a molecular property prediction task, establishing the feasibility of large-scale end-to-end training with our optimized implementation.
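The coefficient-generator idea can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation or the warpKAN API: the sinusoidal `encode_index`, the `TinyMLP` class, the per-index (rather than per-group) encoding, and the grid spacing `h` are all assumptions made for illustration. The sketch shows how a uniform cubic B-spline can be evaluated at any real input using only the four coefficients that are nonzero there, with each coefficient produced on demand from an encoding of its integer grid index.

```python
import math
import random

# Uniform cubic B-spline basis matrix; row r multiplies t**r.
BASIS = [
    [1 / 6, 4 / 6, 1 / 6, 0.0],
    [-3 / 6, 0.0, 3 / 6, 0.0],
    [3 / 6, -6 / 6, 3 / 6, 0.0],
    [-1 / 6, 3 / 6, -3 / 6, 1 / 6],
]


def cubic_basis(t):
    """Weights of the 4 nonzero cubic B-spline bases at local coordinate t in [0, 1)."""
    powers = [1.0, t, t * t, t ** 3]
    return [sum(powers[r] * BASIS[r][c] for r in range(4)) for c in range(4)]


def encode_index(i, dim=8):
    """Hypothetical sinusoidal encoding of an integer grid index (an assumption)."""
    out = []
    for k in range(dim // 2):
        freq = 1.0 / (100.0 ** (2 * k / dim))
        out += [math.sin(i * freq), math.cos(i * freq)]
    return out


class TinyMLP:
    """Hypothetical stand-in for the CG model: one tanh hidden layer, scalar output."""

    def __init__(self, d_in, d_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0, 0.5) for _ in range(d_in)] for _ in range(d_hidden)]
        self.b1 = [0.0] * d_hidden
        self.w2 = [rng.gauss(0, 0.5) for _ in range(d_hidden)]
        self.b2 = 0.0

    def __call__(self, x):
        hidden = [
            math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        return sum(w * v for w, v in zip(self.w2, hidden)) + self.b2


def ukan_edge(x, cg, h=1.0):
    """Evaluate one spline edge at x, generating only the 4 local coefficients."""
    s = x / h
    i = math.floor(s)   # index of the grid cell containing x; any integer is valid
    t = s - i           # local coordinate inside that cell
    weights = cubic_basis(t)
    # Cell i depends on control points i-1, i, i+1, i+2; no coefficient table is stored.
    coeffs = [cg(encode_index(i + j - 1)) for j in range(4)]
    return sum(w * c for w, c in zip(weights, coeffs))


cg = TinyMLP(d_in=8, d_hidden=16, seed=0)
for x in (-1e4, -2.5, 0.0, 3.7, 1e4):
    print(x, ukan_edge(x, cg))
```

Because coefficients are generated from the grid index rather than looked up in a fixed-size table, the same function accepts inputs far outside any predefined range, and the per-input cost does not depend on how many grid cells the data spans.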

Paper Structure (25 sections, 24 equations, 5 figures, 8 tables)

Background
Algorithm
Experiments
Performance Benchmarking
Compute model
Memory models
Tasks
Regression
Classification
Approximation
Generation
Real-world Application: Drug Discovery
Conclusion
Appendix
Cubic B-Spline Basis Matrix Representation
...and 10 more sections

Figures (5)

Figure 1: The UKAN model architecture including grid group positional encoding, coefficient-generator MLP, and B-spline function.
Figure 2: warpKAN vs. torchKAN. (a) Increasing B-spline order yields larger speedups ($5.5\!\times$–$15\!\times$). (b) Increasing grid size gives an average speedup of $\sim\!12\!\times$ and up to $24\!\times$; torchKAN runs out of memory (OOM) at $\ge 256$ knots, while warpKAN scales to $2^{18}$. All values are normalized to the public PyTorch implementation.
Figure 3: Regression task results. (a) RMSE vs. training epochs for Function I using KAN, UKAN, and $\mathrm{MLP}(2\!\to\!5\!\to\!1)$. (b) RMSE vs. training epochs for Function II using KAN, UKAN, and $\mathrm{MLP}(2\!\to\!5\!\to\!1)$. (c) RMSE vs. training epochs for Function III using KAN, UKAN, and $\mathrm{MLP}(16\!\to\!32\!\to\!1)$.
Figure 4: KAN and UKAN used in PINNs. Solving the logistic growth model with both KAN and UKAN $(1\!\to\!5\!\to\!1)$ over the domain $[-5, 5]$.
Figure 5: DDPM with KAN, UKAN, and MLP.
