QuantKAN: A Unified Quantization Framework for Kolmogorov Arnold Networks
Kazi Ahmed Asif Fuad, Lizhong Chen
TL;DR
QuantKAN presents the first unified framework for quantizing Kolmogorov–Arnold Networks by adapting both quantization-aware training and post-training quantization to the dual-branch spline-based architecture. The framework explicitly quantizes base weights, spline coefficients, and activations with branch-aware wrappers, enabling stable training and robust low-bit inference across MNIST and CIFAR datasets and multiple KAN variants. Comprehensive experiments reveal that shallow KANs can reach near full-precision accuracy at 4-bit with LSQ/LSQ+ or PACT, while deeper KAGN models benefit from DoReFa's stability; GPTQ and Uniform generally provide the strongest PTQ performance, with activation quantization remaining a key bottleneck. These findings offer practical guidelines for deploying spline-based networks in resource-constrained settings and motivate further co-design of quantizers and architectures for efficient hardware acceleration.
Abstract
Kolmogorov–Arnold Networks (KANs) represent a new class of neural architectures that replace conventional linear transformations and node-based nonlinearities with spline-based function approximations distributed along network edges. Although KANs offer strong expressivity and interpretability, their heterogeneous spline and base branch parameters hinder efficient quantization, a problem that, unlike quantization of CNNs and Transformers, remains unexamined. In this paper, we present QuantKAN, a unified framework for quantizing KANs across both quantization-aware training (QAT) and post-training quantization (PTQ) regimes. QuantKAN extends modern quantization algorithms, such as LSQ, LSQ+, PACT, DoReFa, QIL, GPTQ, BRECQ, AdaRound, AWQ, and HAWQ-V2, to spline-based layers with branch-specific quantizers for base, spline, and activation components. Through extensive experiments on MNIST, CIFAR-10, and CIFAR-100 across multiple KAN variants (EfficientKAN, FastKAN, PyKAN, and KAGN), we establish the first systematic benchmarks for low-bit spline networks. Our results show that KANs, particularly deeper KAGN variants, are compatible with low-bit quantization but exhibit strong method-architecture interactions: LSQ, LSQ+, and PACT preserve near-full-precision accuracy at 4 bits for shallow KAN MLP and ConvNet models, while DoReFa provides the most stable behavior for deeper KAGN under aggressive low-bit settings. For PTQ, GPTQ and Uniform consistently deliver the strongest overall performance across datasets, with BRECQ highly competitive in simpler regimes such as MNIST. Our proposed QuantKAN framework thus unifies spline learning and quantization, and provides practical tools and guidelines for efficiently deploying KANs in real-world, resource-constrained environments.
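To make the idea of branch-specific quantizers concrete, the following is a minimal sketch, not the QuantKAN implementation. It assumes a plain symmetric uniform quantizer (none of the learned methods such as LSQ or PACT) and represents each branch as a flat list of floats. The function names `fake_quantize` and `quantize_kan_layer` are hypothetical. The point it illustrates is that base weights and spline coefficients can have very different value ranges, so each branch gets its own scale.

```python
from typing import Dict, List, Tuple


def fake_quantize(values: List[float], bits: int) -> Tuple[List[float], float]:
    """Symmetric uniform fake quantization (assumed scheme, for illustration).

    Values are rounded onto a signed integer grid and mapped back to floats,
    so the result shows the precision loss while staying in floating point.
    """
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for signed 4-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # step size of the grid
    dequantized = []
    for v in values:
        q = round(v / scale)                        # nearest grid index
        q = max(-qmax, min(qmax, q))                # clamp to the signed range
        dequantized.append(q * scale)
    return dequantized, scale


def quantize_kan_layer(
    base_weights: List[float], spline_coeffs: List[float], bits: int = 4
) -> Dict[str, object]:
    """Quantize the base and spline branches independently (hypothetical helper)."""
    base_q, base_scale = fake_quantize(base_weights, bits)
    spline_q, spline_scale = fake_quantize(spline_coeffs, bits)
    return {
        "base": base_q,
        "spline": spline_q,
        "base_scale": base_scale,
        "spline_scale": spline_scale,
    }


if __name__ == "__main__":
    # Toy values chosen so the two branches differ in magnitude by about 10x.
    layer = quantize_kan_layer([0.7, -0.35, 0.1], [0.07, -0.02, 0.01], bits=4)
    print(layer["base_scale"], layer["spline_scale"])
```

With a single shared scale, the smaller-magnitude branch in this toy example would collapse onto only one or two grid levels; separate scales keep the full 4-bit range available to each branch.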
