Architectural Scaling Surpass Basis Complexity? Efficient KANs with Single-Parameter Design

Zhijie Chen, Xinglin Zhang, Hongshu Guo, Yue-Jiao Gong

TL;DR

The paper tackles the lack of a unified theory for Kolmogorov-Arnold Networks (KANs) by introducing the Universal KAN (Uni-KAN) framework, which unifies dense and sparse representations and is accompanied by an open-source library. It then proposes the Efficient KAN Expansion (EKE) hypothesis: under a fixed parameter budget, architectural scaling is favored over increasing basis-function complexity, formalized as $N_p = N_c \cdot N_b$. Building on this, the authors present Single-Parameter KANs (SKANs), a family of ultra-lightweight networks in which every edge basis has a single learnable parameter, with purpose-designed bases such as LSS, LArctan, and LSin. Across MNIST, differential equations, and medical image segmentation, SKANs show cross-domain performance improvements (e.g., up to 6.51% F1 gains and up to 6x faster training on MNIST; a 93.1% reduction in test loss in Neural ODE comparisons; Dice improvements alongside substantial parameter reductions), providing empirical support for the central claim that basis-function smoothness and architectural scaling yield superior efficiency. Collectively, Uni-KAN, EKE, and SKAN form a cohesive framework and practical methodology for designing parameter-efficient neural networks.
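
The budget relation $N_p = N_c \cdot N_b$ can be made concrete with a small arithmetic sketch. It assumes that $N_p$ is the total parameter count, $N_c$ the number of edges (connections), and $N_b$ the number of learnable parameters per edge basis; this reading is an interpretation and is not spelled out in the summary. The helper names, the budget of 79,400 parameters, and the fully connected [784, h, 10] layout are hypothetical choices made only for illustration.

```python
def edge_budget(n_params: int, params_per_basis: int) -> int:
    """Edges affordable under a fixed budget: N_c = N_p // N_b (assumed reading)."""
    return n_params // params_per_basis


def hidden_width(n_params: int, params_per_basis: int, n_in: int, n_out: int) -> int:
    """Largest hidden width h of a fully connected [n_in, h, n_out] network
    whose edge count n_in*h + h*n_out fits within the edge budget."""
    return edge_budget(n_params, params_per_basis) // (n_in + n_out)


# Hypothetical budget: the same N_p spent on simple vs. complex bases.
budget = 79_400
for n_b in (1, 4, 8):
    n_c = edge_budget(budget, n_b)
    h = hidden_width(budget, n_b, n_in=784, n_out=10)
    print(f"N_b={n_b}: N_c={n_c}, hidden width={h}")
# N_b=1: N_c=79400, hidden width=100
# N_b=4: N_c=19850, hidden width=25
# N_b=8: N_c=9925, hidden width=12
```

Under these assumptions, a single-parameter basis affords a hidden layer roughly eight times wider than an eight-parameter basis at the same budget, which is the trade-off the EKE hypothesis is about.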

Abstract

The landscape of Kolmogorov-Arnold Networks (KANs) is rapidly expanding, yet lacks a unified theoretical framework and a clear principle for efficient architecture design. This paper addresses these gaps with three core contributions. First, we introduce the Universal KAN (Uni-KAN) framework, a novel abstraction that formally unifies all KAN-style networks through dense and sparse representations. We prove their interchangeability and provide an open-source library for this framework, facilitating future research. Second, we propose the Efficient KAN Expansion (EKE) Hypothesis, a design philosophy positing that allocating parameters to architectural scaling rather than basis function complexity yields superior performance. Third, we present Single-Parameter KANs (SKANs), a family of ultra-lightweight networks that embody the EKE Hypothesis. Our comprehensive experiments provide the first strong empirical validation for the theoretical necessity of basis function smoothness for stable training. Furthermore, SKANs demonstrate state-of-the-art performance, improving F1 scores by up to 6.51% and reducing test loss by 93.1%, while achieving up to 6x faster training speeds compared to existing KAN variants. These results establish a robust framework, a guiding hypothesis, and a practical methodology for designing the next generation of efficient and powerful neural networks. The code is accessible at https://anonymous.4open.science/r/SKAN-EBBB/.

Paper Structure

This paper contains 9 sections, 1 theorem, 10 equations, 10 figures, 3 tables.

Key Result

Proposition 1

Any Sparse KAN, $\mathcal{K}_{sparse}$, with an arbitrary topology can be exactly represented by a Dense KAN, $\mathcal{K}_{dense}$, paired with a static binary mask, $M$.
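
The idea behind Proposition 1 can be illustrated for a single layer with a minimal Python sketch. It assumes a KAN layer computes each output as the sum of edge functions applied to the inputs, and it uses `math.tanh(w * x)` as a hypothetical single-parameter stand-in for an edge basis; this is not one of the paper's LSS, LArctan, or LSin bases. The function names are likewise illustrative and are not taken from the authors' library.

```python
import math


def basis(x: float, w: float) -> float:
    """Hypothetical single-parameter edge function (placeholder, not the paper's basis)."""
    return math.tanh(w * x)


def sparse_forward(x, edges, n_out):
    """Sparse layer: sum edge outputs over only the edges (i, j) that exist."""
    y = [0.0] * n_out
    for (i, j), w in edges.items():
        y[j] += basis(x[i], w)
    return y


def to_dense(edges, n_in, n_out, fill=1.0):
    """Build a dense parameter matrix W and a static binary mask M.
    Absent edges get an arbitrary parameter `fill`; the mask zeroes them out."""
    W = [[fill] * n_out for _ in range(n_in)]
    M = [[0] * n_out for _ in range(n_in)]
    for (i, j), w in edges.items():
        W[i][j] = w
        M[i][j] = 1
    return W, M


def dense_masked_forward(x, W, M):
    """Dense layer: evaluate every edge, then multiply by the binary mask."""
    n_in, n_out = len(W), len(W[0])
    return [
        sum(M[i][j] * basis(x[i], W[i][j]) for i in range(n_in))
        for j in range(n_out)
    ]


# A 2-input, 2-output layer in which the edge (0, 1) is missing.
edges = {(0, 0): 0.5, (1, 0): -1.2, (1, 1): 2.0}
x = [0.3, -0.7]
W, M = to_dense(edges, n_in=2, n_out=2)
y_sparse = sparse_forward(x, edges, n_out=2)
y_dense = dense_masked_forward(x, W, M)
print(all(math.isclose(a, b) for a, b in zip(y_sparse, y_dense)))  # True
```

Because the mask is fixed and binary, the parameters stored for absent edges never affect the output, so the dense form reproduces the sparse layer exactly in this sketch. Applying the same construction layer by layer is one way to read the proposition's claim for arbitrary topologies.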

Figures (10)

  • Figure 1: Visualization of the SKAN architecture and its key components. Top left (a): A basis function from Spl-KAN, the debate between simple and complex bases, and preliminary experimental results supporting the EKE hypothesis. Bottom left (b): Visualization of a [2,2,2] SKAN network and the bases of SKANs. Right (c): Selected performance results for SKANs compared to other models.
  • Figure 2: Comprehensive performance evaluation of Spl-KAN with varying grid sizes across a wide spectrum of learning rates. The analysis encompasses: (a) training loss, (b) test loss, (c) training accuracy, (d) test accuracy, and (e) F1 score. The results consistently identify an optimal performance window within the [0.001, 0.01] learning rate range, guiding subsequent focused analysis.
  • Figure 3: Detailed performance analysis within the optimal learning rate range [0.001, 0.01]. For visual clarity, only grid sizes 1 through 5 are shown. These plots reveal a systematic performance advantage for smaller grid sizes (e.g., g=1, g=2) across all five key metrics, providing strong evidence for the EKE hypothesis.
  • Figure 4: The trade-off between basis function complexity and model performance in Spl-KAN, evaluated for grid sizes 1 through 10. The inverse correlation between mean F1 score and grid size, juxtaposed with the direct correlation between basis parameters and grid size, offers a compelling empirical validation of the EKE hypothesis.
  • Figure 5: Architectural primitives of the Uni-KAN framework. The network consists of layers formed by nodes and edges. We define the sublayer (e.g., $S_{1,2:2}$, circled) as the minimal computational unit, fundamental for parallelized computation.
  • ...and 5 more figures

Theorems & Definitions (4)

  • Definition 1: Generalized KAN
  • Definition 2: Sparse and Dense KANs
  • Proposition 1: Uni-KAN Representation Equivalence
  • Proof: Proof Sketch