Chebyshev Polynomial-Based Kolmogorov-Arnold Networks: An Efficient Architecture for Nonlinear Function Approximation

Sidharth SS; Keerthana AR; Gokul R; Anas KP

TL;DR

This work introduces Chebyshev KAN, a network within the Kolmogorov-Arnold framework whose learnable edge-wise activations are parametrized by Chebyshev polynomials, to address the limitations of traditional neural networks in nonlinear function approximation. The approach yields parameter-efficient models with dynamic activation functions, improved interpretability, and improved numerical stability, demonstrated on MNIST, synthetic function approximation, and fractal-type functions. Ablation studies over initialization, polynomial degree, and Chebyshev type identify the best-performing choices and support the method's robustness. Overall, Chebyshev KAN combines approximation theory with neural-network design, with potential applications in science and engineering, including PDEs.

Abstract

Accurate approximation of complex nonlinear functions is a fundamental challenge across many scientific and engineering domains. Traditional neural network architectures, such as Multi-Layer Perceptrons (MLPs), often struggle to efficiently capture intricate patterns and irregularities present in high-dimensional functions. This paper presents the Chebyshev Kolmogorov-Arnold Network (Chebyshev KAN), a new neural network architecture inspired by the Kolmogorov-Arnold representation theorem, incorporating the powerful approximation capabilities of Chebyshev polynomials. By utilizing learnable functions parametrized by Chebyshev polynomials on the network's edges, Chebyshev KANs enhance flexibility, efficiency, and interpretability in function approximation tasks. We demonstrate the efficacy of Chebyshev KANs through experiments on digit classification, synthetic function approximation, and fractal function generation, highlighting their superiority over traditional MLPs in terms of parameter efficiency and interpretability. Our comprehensive evaluation, including ablation studies, confirms the potential of Chebyshev KANs to address longstanding challenges in nonlinear function approximation, paving the way for further advancements in various scientific and engineering applications.
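For orientation, the two ingredients named in the abstract can be written out explicitly. The block below gives the standard textbook statements of the Kolmogorov-Arnold representation and the Chebyshev recurrences. The symbols ($\Phi_q$, $\phi_{q,p}$, $T_k$, $U_k$) are notation chosen here and may differ from the paper's own.

```latex
% Kolmogorov-Arnold representation: any continuous f on [0,1]^n can be written
% as a sum of compositions of continuous univariate functions.
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

% Chebyshev polynomials of the first kind on [-1, 1]:
T_0(x) = 1, \qquad T_1(x) = x, \qquad T_{k+1}(x) = 2x\,T_k(x) - T_{k-1}(x),
\qquad T_k(\cos\theta) = \cos(k\theta)

% Chebyshev polynomials of the second kind:
U_0(x) = 1, \qquad U_1(x) = 2x, \qquad U_{k+1}(x) = 2x\,U_k(x) - U_{k-1}(x)
```

In a Chebyshev KAN, each univariate function on an edge is a learnable linear combination of such polynomials, so the learned coefficients play the role that fixed activations play in an MLP.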

Paper Structure (33 sections, 24 equations, 7 figures, 3 tables)

Introduction
Kolmogorov-Arnold Theorem
Chebyshev Polynomials
The Chebyshev Kolmogorov-Arnold Network
The Chebyshev Kolmogorov-Arnold Network
Chebyshev Polynomial Representation
Learnable Chebyshev Coefficients
Network Computation
Mathematical Explanation
Advantages over Traditional MLPs
Parameter Efficiency
Dynamic Activation Functions
Enhanced Interpretability
Enhanced Interpretability
Improved Numerical Stability and Approximation Accuracy
...and 18 more sections

Figures (7)

Figure 1: Visualization of the Chebyshev KAN model with 3 input features, degree 1, and output shape 1. The weights/coefficients are not shown in the picture.
Figure 2: Visualization of the Chebyshev polynomials of the first kind
Figure 3: Visualization of the Chebyshev polynomials of the second kind
Figure 4: Illustration of the Chebyshev Kolmogorov-Arnold Network (Chebyshev KAN) architecture. The input tensor $\mathbf{x}$ is transformed into the Chebyshev Polynomials tensor $\mathbf{T}$. This tensor is then multiplied by the Chebyshev Coefficients tensor cheby_coeffs to produce the output tensor $\mathbf{y}$.
Figure 5: Visualization of the Chebyshev Kolmogorov-Arnold Network (Chebyshev KAN) architecture used for the MNIST dataset.
...and 2 more figures
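
The pipeline described in the Figure 4 caption (input $\mathbf{x}$, Chebyshev polynomial tensor $\mathbf{T}$, multiplication by cheby_coeffs, output $\mathbf{y}$) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The class name `ChebyKANLayer`, the `tanh` squashing of inputs into $[-1, 1]$, and the initialization scale are all assumptions made here. Only the name `cheby_coeffs` comes from the caption.

```python
import math
import random


def chebyshev_basis(x, degree):
    """Return [T_0(x), ..., T_degree(x)] using T_{k+1} = 2x*T_k - T_{k-1}."""
    values = [1.0, x]
    for _ in range(2, degree + 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: degree + 1]  # slice also handles degree == 0


class ChebyKANLayer:
    """Hypothetical layer: y_j = sum_i sum_k cheby_coeffs[i][j][k] * T_k(tanh(x_i))."""

    def __init__(self, in_dim, out_dim, degree, seed=0):
        rng = random.Random(seed)
        self.in_dim, self.out_dim, self.degree = in_dim, out_dim, degree
        scale = 1.0 / (in_dim * (degree + 1))  # assumed init scale, not from the source
        # One coefficient per (input, output, polynomial order) triple.
        self.cheby_coeffs = [
            [[rng.gauss(0.0, scale) for _ in range(degree + 1)]
             for _ in range(out_dim)]
            for _ in range(in_dim)
        ]

    def forward(self, x):
        # Assumption: tanh maps each input into [-1, 1], the Chebyshev domain.
        basis = [chebyshev_basis(math.tanh(xi), self.degree) for xi in x]
        return [
            sum(self.cheby_coeffs[i][j][k] * basis[i][k]
                for i in range(self.in_dim)
                for k in range(self.degree + 1))
            for j in range(self.out_dim)
        ]

    def num_parameters(self):
        return self.in_dim * self.out_dim * (self.degree + 1)


# Same shape as the model in Figure 1: 3 inputs, degree 1, 1 output.
layer = ChebyKANLayer(in_dim=3, out_dim=1, degree=1)
print(layer.num_parameters())  # 3 * 1 * (1 + 1) = 6
print(layer.forward([0.2, -1.5, 3.0]))
```

The parameter count grows as in_dim × out_dim × (degree + 1), so the polynomial degree is the knob that trades expressiveness per edge against model size.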
