Lookup multivariate Kolmogorov-Arnold Networks
Sergey Pozdnyakov, Philippe Schwaller
TL;DR
This paper introduces lookup multivariate Kolmogorov-Arnold Networks (lmKANs), a drop-in replacement for high-dimensional linear mappings that swaps dense weight matrices for trainable, low-dimensional multivariate spline functions implemented as lookup tables. The inner functions are encoded as 2D B-splines on a carefully designed sigma/grid and evaluated with dedicated CUDA kernels, so each function can hold many parameters while costing only a few multiplications to compute, which gives higher per-parameter efficiency on GPUs. In general high-dimensional function approximation, lmKANs need up to 6× fewer inference FLOPs while matching the flexibility of MLPs. On a tabular-like dataset of randomly displaced methane configurations, they reach more than 10× higher H100 throughput at equal accuracy. In CNNs, they cut inference FLOPs at matched accuracy by 1.6–2.1× on CIFAR-10 and by about 1.7× on ImageNet-1k. The work also provides Hessian-based regularization and a multistage fitting procedure to stabilize training, together with extensive ablations and a comparison with FastKAN. Overall, lmKANs offer a scalable, hardware-friendly alternative for models whose computational budget is dominated by high-dimensional linear mappings, and they apply to both MLPs and CNNs.
Abstract
High-dimensional linear mappings, or linear layers, dominate both the parameter count and the computational cost of most modern deep-learning models. We introduce a general-purpose drop-in replacement, lookup multivariate Kolmogorov-Arnold Networks (lmKANs), which deliver a substantially better trade-off between capacity and inference cost. Our construction expresses a general high-dimensional mapping through trainable low-dimensional multivariate functions. Each of these functions can carry dozens or hundreds of trainable parameters, yet takes only a few multiplications to evaluate because it is implemented as a spline lookup table. Empirically, lmKANs reduce inference FLOPs by up to 6.0x while matching the flexibility of MLPs in general high-dimensional function approximation. In another feedforward fully connected benchmark, on the tabular-like dataset of randomly displaced methane configurations, lmKANs enable more than 10x higher H100 throughput at equal accuracy. Within convolutional neural networks, lmKAN-based CNNs cut inference FLOPs at matched accuracy by 1.6-2.1x on CIFAR-10 and by 1.7x on ImageNet-1k. Our code, including dedicated CUDA kernels, is available online at https://github.com/schwallergroup/lmkan.
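To make the lookup-table idea concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a uniform grid on [-1, 1], first-order (bilinear) splines, and inputs grouped into consecutive pairs; the function names `lookup_bilinear` and `lmkan_like_layer` are hypothetical. The point it illustrates is that a 2D function stored as a G × G table carries G² trainable values, while evaluating it reads only four table entries and uses a handful of multiplications, independent of G.

```python
import numpy as np


def lookup_bilinear(table, x1, x2, lo=-1.0, hi=1.0):
    """Evaluate a 2D piecewise-bilinear function stored as a (G, G) table.

    Assumption for illustration: node values sit on a uniform grid over
    [lo, hi] x [lo, hi]; inputs outside the range are clamped.
    """
    g = table.shape[0]
    # Map each input to a fractional grid coordinate in [0, g - 1].
    u = (np.clip(x1, lo, hi) - lo) / (hi - lo) * (g - 1)
    v = (np.clip(x2, lo, hi) - lo) / (hi - lo) * (g - 1)
    # Index of the lower-left node of the enclosing cell.
    i = min(int(u), g - 2)
    j = min(int(v), g - 2)
    # Position inside the cell, each in [0, 1].
    a, b = u - i, v - j
    # Four table reads and a few multiplications, regardless of g.
    return ((1 - a) * (1 - b) * table[i, j]
            + a * (1 - b) * table[i + 1, j]
            + (1 - a) * b * table[i, j + 1]
            + a * b * table[i + 1, j + 1])


def lmkan_like_layer(tables, x):
    """Sum 2D lookup functions over input pairs, one function per (output, pair).

    tables: array of shape (n_out, n_pairs, G, G)
    x:      array of shape (2 * n_pairs,)
    Assumption: inputs are grouped as (x[0], x[1]), (x[2], x[3]), ...
    """
    n_out, n_pairs = tables.shape[:2]
    y = np.zeros(n_out)
    for o in range(n_out):
        for p in range(n_pairs):
            y[o] += lookup_bilinear(tables[o, p], x[2 * p], x[2 * p + 1])
    return y
```

In this sketch, increasing G from 5 to 65 raises the parameter count per function from 25 to 4225 without changing the arithmetic per evaluation, which is the capacity versus inference-cost trade-off the abstract describes. A production version would batch the lookups and run them in a fused GPU kernel rather than in Python loops.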
