VIKIN: A Reconfigurable Accelerator for KANs and MLPs with Two-Stage Sparsity Support
Wenhui Ou, Zhuoyu Wu, Yipu Zhang, Zheng Wang, C. Patrick Yue
TL;DR
VIKIN is presented, a reconfigurable accelerator that efficiently supports both KAN and MLP inference using unified hardware and introduces a pipeline execution mode and two-stage sparsity support for efficient KAN processing, while enabling parallel-mode acceleration to improve MLP throughput under the same sparsity framework.
Abstract
Recently, multi-layer perceptrons (MLPs) widely used in modern AI applications suffer from limited real-time performance due to intensive memory access overhead. Kolmogorov--Arnold Networks (KANs) have attracted increasing attention as an alternative architecture with similar structures to MLPs but improved parameter efficiency. However, the lack of dedicated hardware support limits the practical performance benefits of KANs. Moreover, since many edge workloads still rely heavily on MLPs, accelerators designed exclusively for KANs become inefficient and impractical. In this work, we present VIKIN, a reconfigurable accelerator that efficiently supports both KAN and MLP inference using unified hardware. VIKIN introduces a pipeline execution mode and two-stage sparsity support for efficient KAN processing, while enabling parallel-mode acceleration to improve MLP throughput under the same sparsity framework. Experiments on real-world datasets demonstrate that replacing MLPs with KANs on VIKIN achieves $1.28\times$ acceleration with $19.58\%$ reduced accuracy loss. For a higher-accuracy KAN model requiring $3.29\times$ more operations, VIKIN incurs only $1.24\times$ latency overhead compared with the baseline KAN model. In addition, VIKIN achieves $1.25\times$ speedup and $4.87\times$ higher energy efficiency than a representative edge GPU when executing KAN workloads.
