A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks
Khemraj Shukla, Juan Diego Toscano, Zhicheng Wang, Zongren Zou, George Em Karniadakis
TL;DR
The paper compares MLP-based PINNs and DeepONets with KAN-based representations (PIKAN, cPIKAN, and DeepOKAN) for solving forward and inverse differential equations and for learning operators. It shows that vanilla KANs with B-splines can be slow and less accurate, while low-order polynomial KANs (notably Chebyshev-based variants) achieve competitive accuracy, with cPIKAN offering favorable parameter efficiency and robustness under certain conditions. The study also demonstrates that residual-based attention and entropy-viscosity stabilization can substantially improve performance on PDE problems, and it analyzes training dynamics through information bottleneck theory, identifying fitting, diffusion, and total-diffusion stages. Overall, the work provides a benchmarking framework that follows the FAIR principles and highlights when KAN-based approaches can match or exceed traditional PINN/DeepONet performance, as well as directions for stability, scalability, and uncertainty quantification in SciML. The results have practical implications for choosing a representation model in physics-informed learning and operator regression across a range of PDEs and high-dimensional problems.
Abstract
Kolmogorov-Arnold Networks (KANs) were recently introduced as an alternative representation model to the MLP. Herein, we employ KANs to construct physics-informed machine learning models (PIKANs) and deep operator models (DeepOKANs) for solving differential equations in both forward and inverse problems. In particular, we compare them with physics-informed neural networks (PINNs) and deep operator networks (DeepONets), which are based on the standard MLP representation. We find that although the original KANs based on the B-spline parameterization lack accuracy and efficiency, modified versions based on low-order orthogonal polynomials perform comparably to PINNs and DeepONets. However, they still lack robustness, as they may diverge for different random seeds or for higher-order orthogonal polynomials. We visualize their corresponding loss landscapes and analyze their learning dynamics using information bottleneck theory. Our study follows the FAIR principles so that other researchers can use our benchmarks to further advance this emerging topic.
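To make the idea of a KAN layer "based on low-order orthogonal polynomials" more concrete, the following is a minimal NumPy sketch of a single Chebyshev-based layer. It is an illustration under stated assumptions, not the authors' implementation: the function names `chebyshev_basis` and `chebyshev_kan_layer`, the coefficient layout `(n_in, n_out, degree + 1)`, and the use of `tanh` to map inputs into [-1, 1] are all choices made here for the example.

```python
import numpy as np

def chebyshev_basis(z, degree):
    """Evaluate T_0..T_degree at z (expected in [-1, 1]).

    Returns an array of shape z.shape + (degree + 1,).
    """
    # Three-term recurrence: T_0 = 1, T_1 = z, T_k = 2 z T_{k-1} - T_{k-2}
    T = [np.ones_like(z), z]
    for _ in range(2, degree + 1):
        T.append(2.0 * z * T[-1] - T[-2])
    return np.stack(T[: degree + 1], axis=-1)

def chebyshev_kan_layer(x, coeffs):
    """Hypothetical Chebyshev KAN layer.

    x:      (batch, n_in) inputs
    coeffs: (n_in, n_out, degree + 1) learnable coefficients (assumed layout)
    Returns (batch, n_out): each output sums one univariate polynomial per input.
    """
    degree = coeffs.shape[-1] - 1
    z = np.tanh(x)  # assumption: squash inputs into [-1, 1]
    basis = chebyshev_basis(z, degree)  # (batch, n_in, degree + 1)
    return np.einsum("bik,iok->bo", basis, coeffs)

# Usage: 4 samples, 2 inputs, 3 outputs, polynomial degree 3
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 2))
coeffs = rng.normal(size=(2, 3, 4)) * 0.1
y = chebyshev_kan_layer(x, coeffs)
```

In this sketch the learnable parameters are the polynomial coefficients on each input-output edge rather than scalar weights followed by a fixed activation, which is the structural difference from an MLP layer. The degree is kept low here in line with the abstract's remark that higher-order polynomials may diverge.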
