KAN-ODEs: Kolmogorov-Arnold Network Ordinary Differential Equations for Learning Dynamical Systems and Hidden Physics

Benjamin C. Koenig; Suyong Kim; Sili Deng

TL;DR

This work applies Kolmogorov-Arnold networks as the backbone of a neural ordinary differential equation (ODE) framework, generalizing their use to the time-dependent and temporal grid-sensitive cases often seen in dynamical systems and scientific machine learning applications.

Abstract

Kolmogorov-Arnold networks (KANs) as an alternative to multi-layer perceptrons (MLPs) are a recent development demonstrating strong potential for data-driven modeling. This work applies KANs as the backbone of a neural ordinary differential equation (ODE) framework, generalizing their use to the time-dependent and temporal grid-sensitive cases often seen in dynamical systems and scientific machine learning applications. The proposed KAN-ODEs retain the flexible dynamical system modeling framework of Neural ODEs while leveraging the many benefits of KANs compared to MLPs, including higher accuracy and faster neural scaling, stronger interpretability and generalizability, and lower parameter counts. First, we quantitatively demonstrated these improvements in a comprehensive study of the classical Lotka-Volterra predator-prey model. We then showcased the KAN-ODE framework's ability to learn symbolic source terms and complete solution profiles in higher-complexity and data-lean scenarios including wave propagation and shock formation, the complex Schrödinger equation, and the Allen-Cahn phase separation equation. The successful training of KAN-ODEs, and their improved performance compared to traditional Neural ODEs, implies significant potential in leveraging this novel network architecture in myriad scientific machine learning applications for discovering hidden physics and predicting dynamic evolution.
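The core idea of the abstract, a KAN standing in for the right-hand side of an ODE, can be illustrated with a minimal forward-pass sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the Gaussian radial basis functions, the SiLU base term, the fixed grid on [-1, 1], the random untrained weights, the fixed-step RK4 integrator, and all function names are hypothetical. The layer shapes [2, 10, 5] and [10, 2, 5] are borrowed from the Figure 4 caption, and the choice of one base weight per activation on top of five grid coefficients was made because it reproduces the 240-parameter count quoted there. Training, which the paper performs with adjoint-based gradients, is omitted.

```python
import math
import random

def make_kan_layer(n_in, n_out, n_grid, rng):
    """Hypothetical KAN layer: every input->output edge has its own 1-D function."""
    centers = [-1.0 + 2.0 * g / (n_grid - 1) for g in range(n_grid)]  # fixed grid
    coef = [[[rng.gauss(0.0, 0.1) for _ in range(n_grid)]
             for _ in range(n_in)] for _ in range(n_out)]  # RBF weights (trainable)
    base = [[rng.gauss(0.0, 0.1) for _ in range(n_in)]
            for _ in range(n_out)]                          # base weights (trainable)
    return {"centers": centers, "coef": coef, "base": base}

def edge_fn(x, centers, coef, base_w):
    """One learnable activation: Gaussian RBF expansion plus a weighted SiLU term."""
    width = centers[1] - centers[0]
    rbf = sum(c * math.exp(-((x - m) / width) ** 2) for c, m in zip(coef, centers))
    return rbf + base_w * x / (1.0 + math.exp(-x))

def kan(layers, u):
    """Each output node sums the edge activations applied to all of its inputs."""
    for layer in layers:
        u = [sum(edge_fn(x, layer["centers"], layer["coef"][o][i], layer["base"][o][i])
                 for i, x in enumerate(u))
             for o in range(len(layer["coef"]))]
    return u

def n_params(layers):
    """Per edge: n_grid RBF coefficients plus one base weight (an assumption)."""
    return sum(len(l["coef"]) * len(l["coef"][0]) * (len(l["centers"]) + 1)
               for l in layers)

def rk4_solve(f, u0, ts):
    """Integrate du/dt = f(u) over the time points ts with classical RK4 steps."""
    traj = [list(u0)]
    for t0, t1 in zip(ts, ts[1:]):
        h, u = t1 - t0, traj[-1]
        k1 = f(u)
        k2 = f([a + 0.5 * h * b for a, b in zip(u, k1)])
        k3 = f([a + 0.5 * h * b for a, b in zip(u, k2)])
        k4 = f([a + h * b for a, b in zip(u, k3)])
        traj.append([a + h / 6.0 * (p + 2 * q + 2 * r + s)
                     for a, p, q, r, s in zip(u, k1, k2, k3, k4)])
    return traj

rng = random.Random(0)
layers = [make_kan_layer(2, 10, 5, rng), make_kan_layer(10, 2, 5, rng)]
ts = [0.02 * k for k in range(51)]
# The KAN plays the role of du/dt for a two-component state such as (x, y).
traj = rk4_solve(lambda u: kan(layers, u), [1.0, 1.0], ts)
print(n_params(layers))         # 240
print(len(traj), len(traj[0]))  # 51 2
```

Under the same counting assumption, the pruned shapes [2, 3, 5] and [3, 2, 5] give twelve activations and 72 parameters, which also matches the figures quoted in the Figure 4 caption.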

Paper Structure (13 sections, 25 equations, 10 figures, 4 tables)

This paper contains 13 sections, 25 equations, 10 figures, 4 tables.

Introduction
Methodology
Kolmogorov-Arnold Networks as Gradient Evaluators
KAN-Ordinary Differential Equations
Experiments
KAN-ODEs vs Neural ODEs: Extensive Comparison via Lotka-Volterra Equations
Benchmark Tests and Neural Scaling Behavior
Interpretation and Generalization of KAN-ODEs with Varying Sizes
Modeling Hidden Physics in PDEs: Fisher-KPP PDE
Data-Driven Solutions of PDEs
Burgers' Equation
Schrödinger Equation
Conclusions

Figures (10)

Figure 1: Qualitative depiction of KAN-ODE's general capability compared to similar machine learning techniques.
Figure 2: Schematic depicting the overall training cycle of a KAN-ODE. The loop in green leverages a KAN as a temporal gradient getter for the state vector to solve the ODE forward. Once a solution is generated, the blue loop computes the gradient of the loss function via the adjoint method to update the KAN activation functions.
Figure 3: Comparison between KAN-ODE and Neural ODE for the Lotka-Volterra predator-prey model. (A) Synthetic data (training and testing) and KAN-ODE reconstruction. (B) Loss profile during training for (B1) KAN-ODE and (B2) MLP-based Neural ODE of comparable sizes (240 and 252 parameters, respectively). (C) Comparison of converged KAN-ODE and Neural ODE error using different model sizes and two MLP depths ($d=2$ and $d=3$). Neural scaling rates of $N^{-2}$ and $N^{-4}$ are plotted for comparison, as per the theory in Liu et al. (2024).
Figure 4: Sparsification, pruning, symbolic regression, and generalization results for Lotka-Volterra dynamics. (A) Sparse KAN with 72 parameters ([2, 3, 5], [3, 2, 5]), pruned from an initial 240 parameters ([2, 10, 5], [10, 2, 5]). (B) Symbolic KAN derived from (A), where each of the twelve activations is replaced by a univariate symbolic expression. (C) Generalization error when extrapolating outside of the yellow $(x, y)$ points explored in the training data. Each column represents a different gradient getter and has one row each for $dx/dt$ and $dy/dt$. Generalization is seen to improve when moving from MLP gradients to increasingly sparse KANs, and finally to symbolic representations fitted to KANs.
Figure 5: Fisher-KPP equation: (A) Solution field $u(x,t)$ of the ground truth. (B) Solution field $u(x,t)$ of the prediction with a learned KAN-ODE model. (C) Loss function. (D) Learned hidden physics of the reaction source term in the Fisher-KPP equation and its symbolic form.
...and 5 more figures
