Degree-Optimized Cumulative Polynomial Kolmogorov-Arnold Networks

Mathew Vanherreweghe, Lirandë Pira, Patrick Rebentrost

TL;DR

CP-KAN addresses the challenge of learning high-dimensional functions with limited data by integrating Chebyshev-polynomial activations into a Kolmogorov-Arnold network and reframing per-neuron degree selection as a QUBO optimization. This reduces the combinatorial burden to a single optimization per layer, enabling adaptive complexity without prohibitive cost. Empirically, CP-KAN achieves competitive regression performance with far fewer parameters, demonstrates robustness to input scaling, and gains theoretical grounding in financial time-series mean-reversion via Chebyshev expansions. The work highlights a principled direction for efficient neural architectures that blend classical approximation theory with discrete optimization, with potential extensions to online adaptation and quantum annealing.

Abstract

We introduce cumulative polynomial Kolmogorov-Arnold networks (CP-KAN), a neural architecture combining Chebyshev polynomial basis functions and quadratic unconstrained binary optimization (QUBO). Our primary contribution involves reformulating the degree selection problem as a QUBO task, reducing the complexity from $O(D^N)$ to a single optimization step per layer. This approach enables efficient degree selection across neurons while maintaining computational tractability. The architecture performs well in regression tasks with limited data, showing good robustness to input scales and natural regularization properties from its polynomial basis. Additionally, theoretical analysis establishes connections between CP-KAN's performance and properties of financial time series. Our empirical validation across multiple domains demonstrates competitive performance compared to several traditional architectures tested, especially in scenarios where data efficiency and numerical stability are important. Our implementation, including strategies for managing computational overhead in larger networks, is available in Ref.~\citep{cpkan_implementation}.
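The degree-selection idea in the abstract can be illustrated with a minimal sketch for a single neuron. This is not the authors' formulation. It assumes one binary variable per candidate degree, a cost equal to the least-squares Chebyshev fit error plus a hypothetical complexity penalty `lam * d`, and a quadratic penalty enforcing that exactly one degree is chosen. The function names are illustrative, and the brute-force solver stands in for whatever QUBO solver or annealer would be used in practice.

```python
import itertools
import numpy as np
from numpy.polynomial import chebyshev as C


def degree_costs(x, y, max_degree, lam=1e-3):
    """Cost per candidate degree: least-squares MSE plus an assumed penalty lam * d."""
    costs = []
    for d in range(max_degree + 1):
        coef = C.chebfit(x, y, d)
        mse = np.mean((C.chebval(x, coef) - y) ** 2)
        costs.append(mse + lam * d)
    return np.array(costs)


def build_qubo(costs, penalty):
    """Upper-triangular Q for sum_d c_d z_d + penalty * (sum_d z_d - 1)^2.

    Using z_d^2 = z_d, the penalty expands to -penalty on the diagonal and
    2 * penalty on each pair (d < e); the additive constant is dropped.
    """
    n = len(costs)
    Q = np.zeros((n, n))
    Q[np.diag_indices(n)] = costs - penalty
    Q[np.triu_indices(n, k=1)] = 2.0 * penalty
    return Q


def solve_qubo_brute_force(Q):
    """Enumerate all bitstrings; only viable for a handful of variables."""
    n = Q.shape[0]
    best_z, best_energy = None, np.inf
    for bits in itertools.product((0, 1), repeat=n):
        z = np.array(bits)
        energy = z @ Q @ z
        if energy < best_energy:
            best_z, best_energy = z, energy
    return best_z


rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 4 * x**3 - 3 * x + 0.01 * rng.standard_normal(x.size)  # noisy T_3(x)

costs = degree_costs(x, y, max_degree=5)
Q = build_qubo(costs, penalty=10.0 * costs.max())
z = solve_qubo_brute_force(Q)
selected_degree = int(np.argmax(z))
```

For a whole layer, one such block per neuron could be stacked into a single larger matrix and solved once, which is the sense in which the per-layer search avoids enumerating all $D^N$ degree combinations. Any coupling terms between neurons would have to come from the paper's actual formulation and are not modeled here.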

Paper Structure

This paper contains 52 sections, 1 theorem, 16 equations, 8 figures, 13 tables, 1 algorithm.

Key Result

Theorem 1

For the mean-reverting SDE $dX_t = \theta(\mu - X_t)\,dt + \sigma\,dW_t,$ the infinitesimal generator $\mathcal{A}$ admits a Chebyshev expansion: with $|\lambda_n|\le C(1+n^2).$
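As a rough illustration of why a Chebyshev expansion of the generator is natural, the sketch below applies the standard generator of this SDE, $\mathcal{A}f(x) = \theta(\mu - x)f'(x) + \tfrac{1}{2}\sigma^2 f''(x)$, to a single Chebyshev polynomial $T_n$ and returns the result in the Chebyshev basis. It does not reproduce the theorem's expansion or its coefficients $\lambda_n$. It assumes the state has been rescaled to $[-1, 1]$, and the function name is hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def ou_generator_on_chebyshev(n, theta, mu, sigma):
    """Chebyshev coefficients of A T_n, where A f = theta*(mu - x) f' + 0.5*sigma^2 f''."""
    t_n = np.zeros(n + 1)
    t_n[n] = 1.0                      # coefficient vector of T_n
    d1 = C.chebder(t_n, 1)            # T_n'
    d2 = C.chebder(t_n, 2)            # T_n''
    drift = theta * C.chebsub(mu * d1, C.chebmulx(d1))  # theta * (mu - x) * T_n'
    diffusion = 0.5 * sigma**2 * d2
    return C.chebadd(drift, diffusion)


# Worked case: T_2 = 2x^2 - 1, so T_2' = 4x and T_2'' = 4.
# With theta=2, mu=0.5, sigma=1:  A T_2 = -2 T_0 + 4 T_1 - 4 T_2.
coeffs = ou_generator_on_chebyshev(2, theta=2.0, mu=0.5, sigma=1.0)
```

Because the drift is linear and the diffusion constant, $\mathcal{A}T_n$ is again a polynomial of degree at most $n$, so its Chebyshev expansion is finite.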

Figures (8)

  • Figure 1: Single layer architecture of CP-KAN showing input features ($x_1,\ldots,x_n$), projection neurons ($\alpha$), Chebyshev polynomial transformations ($T_{d_i}$), and their linear combination for output prediction. Each projection neuron learns optimal weights and bias for its input transformation.
  • Figure 2: Performance comparison across KAN architectures on the Jane Street Market Prediction dataset. All CP-KAN variants achieve stronger $R^2$ values than other KAN architectures.
  • Figure 3: Comparison of training behavior and generalization between CP-KAN and MLPs across two regression tasks. CP-KAN shows more stable and consistent performance.
  • Figure 4: Training stability comparison between MLP and CP-KAN on the Jane Street Market Prediction dataset across varying learning rates.
  • Figure 5: Impact of model complexity on MNIST test accuracy for CP-KAN solved via QUBO (dashed) and integer programming (solid).
  • ...and 3 more figures
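The single-layer structure described in the Figure 1 caption could be sketched as follows. This is a guess at the data flow, not the released implementation. It assumes each projection neuron computes $\alpha_i = w_i \cdot x + b_i$, that $\alpha_i$ is squashed with `tanh` so it lies in the Chebyshev domain $[-1, 1]$, that each neuron applies a Chebyshev series up to its own degree $d_i$, and that the outputs are combined linearly. All names and shapes are illustrative.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def cpkan_layer_forward(x, W, b, coeffs, out_weights):
    """Forward pass of one sketched CP-KAN layer.

    x: (batch, n_in) inputs; W: (n_neurons, n_in); b: (n_neurons,)
    coeffs[i]: (d_i + 1,) Chebyshev coefficients for neuron i
    out_weights: (n_neurons,) weights of the final linear combination
    """
    alpha = x @ W.T + b                # projection neurons
    u = np.tanh(alpha)                 # assumed squashing into [-1, 1]
    phi = np.stack(
        [C.chebval(u[:, i], coeffs[i]) for i in range(W.shape[0])],
        axis=1,
    )                                  # per-neuron Chebyshev series, shape (batch, n_neurons)
    return phi @ out_weights


rng = np.random.default_rng(0)
n_in, n_neurons, batch = 4, 3, 8
degrees = [1, 3, 2]                    # e.g. as selected per neuron
x = rng.standard_normal((batch, n_in))
W = rng.standard_normal((n_neurons, n_in))
b = rng.standard_normal(n_neurons)
coeffs = [rng.standard_normal(d + 1) for d in degrees]
out_weights = rng.standard_normal(n_neurons)

y_hat = cpkan_layer_forward(x, W, b, coeffs, out_weights)  # shape (batch,)
```

Letting each neuron carry its own coefficient vector of length $d_i + 1$ is what allows the degree to differ per neuron.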

Theorems & Definitions (1)

  • Theorem 1: Generator Decomposition \citep{infinitesimalGen2007}