Chebyshev Polynomial-Based Kolmogorov-Arnold Networks: An Efficient Architecture for Nonlinear Function Approximation

Sidharth SS, Keerthana AR, Gokul R, Anas KP

TL;DR

This work introduces Chebyshev KAN, a network that places learnable edge-wise activation functions, parametrized by Chebyshev polynomials, within the Kolmogorov-Arnold framework to address the limitations of traditional neural networks in nonlinear function approximation. The authors report parameter-efficient models with improved interpretability and numerical stability, evaluated on MNIST digit classification, synthetic function approximation, and fractal-type functions. Ablation studies examine the choice of initialization, polynomial degree, and Chebyshev type, and support the method's robustness. Overall, Chebyshev KAN combines approximation theory with neural-network design, with promising implications for scientific and engineering applications, including PDEs.

Abstract

Accurate approximation of complex nonlinear functions is a fundamental challenge across many scientific and engineering domains. Traditional neural network architectures, such as Multi-Layer Perceptrons (MLPs), often struggle to efficiently capture intricate patterns and irregularities present in high-dimensional functions. This paper presents the Chebyshev Kolmogorov-Arnold Network (Chebyshev KAN), a new neural network architecture inspired by the Kolmogorov-Arnold representation theorem, incorporating the powerful approximation capabilities of Chebyshev polynomials. By utilizing learnable functions parametrized by Chebyshev polynomials on the network's edges, Chebyshev KANs enhance flexibility, efficiency, and interpretability in function approximation tasks. We demonstrate the efficacy of Chebyshev KANs through experiments on digit classification, synthetic function approximation, and fractal function generation, highlighting their superiority over traditional MLPs in terms of parameter efficiency and interpretability. Our comprehensive evaluation, including ablation studies, confirms the potential of Chebyshev KANs to address longstanding challenges in nonlinear function approximation, paving the way for further advancements in various scientific and engineering applications.

Paper Structure (33 sections, 24 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Visualization of the Chebyshev KAN model with 3 input features, degree 1, and output shape 1. The weights/coefficients are not shown in the picture.
  • Figure 2: Visualization of the Chebyshev polynomials of the first kind
  • Figure 3: Visualization of the Chebyshev polynomials of the second kind
  • Figure 4: Illustration of the Chebyshev Kolmogorov-Arnold Network (Chebyshev KAN) architecture. The input tensor $\mathbf{x}$ is transformed into the Chebyshev Polynomials tensor $\mathbf{T}$. This tensor is then multiplied by the Chebyshev Coefficients tensor cheby_coeffs to produce the output tensor $\mathbf{y}$.
  • Figure 5: Visualization of the Chebyshev Kolmogorov-Arnold Network (Chebyshev KAN) architecture used for the MNIST dataset.
  • ...and 2 more figures
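
The data flow described in the Figure 4 caption (input x, Chebyshev polynomials tensor T, coefficients tensor cheby_coeffs, output y) can be illustrated with a minimal, dependency-free sketch. Only the name cheby_coeffs comes from the caption. The function names, the (input, output, degree) layout of the coefficients, and the use of tanh to map inputs into [-1, 1] are assumptions made for illustration, not the authors' confirmed implementation. The sketch uses Chebyshev polynomials of the first kind, built with the standard recurrence T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x).

```python
import math


def chebyshev_basis(x, degree):
    """Return [T_0(x), ..., T_degree(x)] for first-kind Chebyshev polynomials.

    Uses the recurrence T_0 = 1, T_1 = x, T_{k+1} = 2*x*T_k - T_{k-1}.
    """
    values = [1.0, x]
    for _ in range(2, degree + 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: degree + 1]  # also handles degree == 0


def chebyshev_kan_layer(x, cheby_coeffs):
    """Apply one illustrative Chebyshev KAN layer to a single input vector.

    x            : list of in_dim floats
    cheby_coeffs : nested list, cheby_coeffs[i][o][k] is the coefficient of
                   T_k on the edge from input i to output o
                   (this layout is an assumption of the sketch)
    """
    in_dim = len(cheby_coeffs)
    out_dim = len(cheby_coeffs[0])
    degree = len(cheby_coeffs[0][0]) - 1

    # Assumption: squash inputs into [-1, 1], the natural domain of T_k.
    T = [chebyshev_basis(math.tanh(v), degree) for v in x]

    # y[o] = sum over inputs i and degrees k of cheby_coeffs[i][o][k] * T[i][k]
    y = [0.0] * out_dim
    for i in range(in_dim):
        for o in range(out_dim):
            for k in range(degree + 1):
                y[o] += cheby_coeffs[i][o][k] * T[i][k]
    return y


# Same shape as the model in Figure 1: 3 input features, degree 1, 1 output.
example_coeffs = [
    [[1.0, 2.0]],  # edge from input 0 to output 0: 1.0*T_0 + 2.0*T_1
    [[0.0, 1.0]],  # edge from input 1 to output 0
    [[0.5, 0.0]],  # edge from input 2 to output 0
]
print(chebyshev_kan_layer([0.3, -1.2, 2.0], example_coeffs))
```

In this formulation each edge carries its own small polynomial, so the learnable parameters are the coefficients themselves rather than a weight matrix followed by a fixed activation. A practical implementation would operate on batched tensors in a framework such as PyTorch or JAX and learn cheby_coeffs by gradient descent.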