Construction of the Kolmogorov-Arnold representation using the Newton-Kaczmarz method

Michael Poluektov, Andrew Polar

TL;DR

This work develops a data-driven framework to construct a Kolmogorov-Arnold representation for multivariate inputs by parameterizing inner and outer functions with flexible basis sets and estimating the parameters via the Newton-Kaczmarz method. It demonstrates that the KA model, when trained with NK, can serve as an effective PDE solver and surrogate for nonlinear mappings, offering robustness to initial guesses and enabling parallelization. Across ridge-function, nonlinear-function, PDE, and solid-mechanics examples, the KA-NK approach achieves high accuracy with reduced training time and memory footprint compared to Gauss-Newton methods and neural networks. The results underscore the method’s practical potential as a hybrid between data-driven modelling and discretized physics-based solvers, with open-source implementations provided.
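For orientation, the classical Kolmogorov-Arnold representation of a continuous function $f$ of $n$ variables on the unit cube can be written as below. The symbols $\Phi_q$ (outer functions) and $\psi_{q,p}$ (inner functions) are chosen here for illustration and may differ from the notation used in the paper itself.

```latex
f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \psi_{q,p}(x_p) \right),
\qquad (x_1, \dots, x_n) \in [0,1]^n .
```

In a data-driven setting, each $\Phi_q$ and $\psi_{q,p}$ is replaced by a parameterised approximation, so the equality generally becomes an approximation.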

Abstract

It is known that any continuous multivariate function can be represented exactly by a composition of functions of a single variable, the so-called Kolmogorov-Arnold representation. It can be a convenient tool for tasks that require a predictive model mapping a vector input of a black-box system to a scalar output. In this case, the representation may not be exact, and it is more correct to refer to such a structure as the Kolmogorov-Arnold model (or, as more recently popularised, 'network'). Constructing such a model from recorded input-output data is a challenging task. In the present paper, it is suggested to decompose the underlying functions of the representation into continuous basis functions and parameters. It is then proposed to find the parameters using the Newton-Kaczmarz method for solving systems of non-linear equations. The algorithm is then modified to support parallelisation. The paper demonstrates that such an approach is also an excellent tool for the data-driven solution of partial differential equations. Numerical examples show that, for the considered model, the Newton-Kaczmarz method for parameter estimation is efficient and more robust with respect to the selection of the initial guess than the straightforward application of the Gauss-Newton method. Finally, the Kolmogorov-Arnold model is compared to MATLAB's built-in neural networks on a relatively large-scale problem (25 inputs, datasets of 10 million records), significantly outperforming the multilayer perceptrons (MLPs) in this particular problem (4-10 minutes vs. 4-8 hours of training time, as well as higher accuracy, lower CPU usage, and smaller memory footprint).
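The idea of parameterising the inner and outer functions and updating the parameters one record at a time can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes piecewise-linear basis functions, inputs scaled to $[0,1]$, outer functions defined on a fixed interval with clipping, and a plain damped Kaczmarz projection onto the linearised equation for each record. The names `KAModel`, `locate`, `nk_step`, `train` and the damping factor `mu` are hypothetical.

```python
import numpy as np

def locate(t, lo, hi, n):
    """Map t in [lo, hi] to (k, w): left node index and weight of node k+1."""
    s = (np.clip(t, lo, hi) - lo) / (hi - lo) * (n - 1)
    k = np.minimum(s.astype(int), n - 2)
    return k, s - k

class KAModel:
    """f(x) ~ sum_q Phi_q(sum_p psi_qp(x_p)), all functions piecewise linear."""

    def __init__(self, n_in, n_outer, n_nodes_in, n_nodes_out, seed=0):
        rng = np.random.default_rng(seed)
        # Nodal values of the inner functions psi_qp on [0, 1] (random initial guess).
        self.inner = rng.uniform(0.0, 1.0, (n_outer, n_in, n_nodes_in))
        # Nodal values of the outer functions Phi_q on [lo, hi].
        self.outer = rng.uniform(0.0, 1.0, (n_outer, n_nodes_out))
        self.lo, self.hi = 0.0, float(n_in)  # assumed range of the inner sums
        self.P, self.Q = np.arange(n_in), np.arange(n_outer)

    def _forward(self, x):
        k, w = locate(x, 0.0, 1.0, self.inner.shape[2])
        psi = (1 - w) * self.inner[:, self.P, k] + w * self.inner[:, self.P, k + 1]
        kq, wq = locate(psi.sum(axis=1), self.lo, self.hi, self.outer.shape[1])
        phi = (1 - wq) * self.outer[self.Q, kq] + wq * self.outer[self.Q, kq + 1]
        return phi.sum(), (k, w, kq, wq)

    def predict(self, X):
        return np.array([self._forward(x)[0] for x in X])

    def nk_step(self, x, y, mu):
        """One damped Kaczmarz projection for a single record (x, y)."""
        y_hat, (k, w, kq, wq) = self._forward(x)
        h = (self.hi - self.lo) / (self.outer.shape[1] - 1)
        # Derivative of each outer function at the current inner sum.
        slope = (self.outer[self.Q, kq + 1] - self.outer[self.Q, kq]) / h
        g_left = slope[:, None] * (1 - w)[None, :]   # d(model)/d(inner node k)
        g_right = slope[:, None] * w[None, :]        # d(model)/d(inner node k+1)
        norm2 = np.sum((1 - wq) ** 2 + wq ** 2) + np.sum(g_left ** 2 + g_right ** 2)
        step = mu * (y - y_hat) / norm2
        self.outer[self.Q, kq] += step * (1 - wq)
        self.outer[self.Q, kq + 1] += step * wq
        self.inner[:, self.P, k] += step * g_left
        self.inner[:, self.P, k + 1] += step * g_right

def train(model, X, y, mu=0.1, passes=10):
    """Sweep the records sequentially; return the RMSE after each pass."""
    history = []
    for _ in range(passes):
        for xi, yi in zip(X, y):
            model.nk_step(xi, yi, mu)
        history.append(float(np.sqrt(np.mean((model.predict(X) - y) ** 2))))
    return history

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, 1.0, (2000, 2))
    y = np.exp(np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2)  # hypothetical test function
    print(train(KAModel(2, 5, 6, 12), X, y, mu=0.1, passes=10))
```

Each update touches only the few nodal values active for the current record, which is why a sweep over the data is cheap compared with assembling and solving a full Gauss-Newton system.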

Paper Structure

This paper contains 32 sections, 65 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: The dependence of the RMSE on the number of passes through the data for different values of $\mu$.
  • Figure 2: The sections of the function $u\left(x_1,x_2\right)$ at $x_2=0.5$ (blue), $x_2=1$ (brown), and $x_2=1.5$ (red). Solid lines are the numerical solutions of the second-order PDE, obtained by training the Kolmogorov-Arnold model with different numbers of passes (batch updates). Symbols show the analytical solution.
  • Figure 3: The dependence of the non-normalised RMSE on the mesh size corresponding to the inputs (a) and on the number of domain-internal points for the training (b). The mesh size is defined as $2/n$, where $2$ is the size of the domain and $n$ is the number of the inner basis functions. The vertical line in (b) indicates $N_\mathrm{I}$ equal to the number of the model's parameters. The red and the blue colours correspond to the solutions for $u_1$ and $u_2$, respectively.
  • Figure 4: The comparison of the performance of the Kolmogorov-Arnold (KA) model and MATLAB's built-in neural networks (a) in the problem of building a regression model of the determinant of a $5$-by-$5$ matrix. Evolution of the accuracy of the KA model (measured on the validation dataset) after each pass through the training dataset (b).
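
The determinant benchmark in Figure 4 has 25 inputs because a $5$-by-$5$ matrix is flattened into a vector of its entries. A sketch of how such a dataset might be generated is given below. The sampling distribution (uniform on $[0,1]$), the function name `make_determinant_dataset` and the seed are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def make_determinant_dataset(n_records, size=5, seed=0):
    """Return (X, y): flattened random matrices and their determinants."""
    rng = np.random.default_rng(seed)
    # Assumed sampling: independent entries, uniform on [0, 1].
    mats = rng.uniform(0.0, 1.0, (n_records, size, size))
    X = mats.reshape(n_records, size * size)  # 25 inputs per record for size=5
    y = np.linalg.det(mats)                   # determinant of each stacked matrix
    return X, y

if __name__ == "__main__":
    X, y = make_determinant_dataset(1000)
    print(X.shape, y.shape)
```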