
A Kolmogorov-Arnold Surrogate Model for Chemical Equilibria: Application to Solid Solutions

Leonardo Boledi, Dirk Bosbach, Jenna Poonoosamy

Abstract

Geochemical solvers are computationally expensive. In reactive transport simulations, where chemical calculations are performed up to billions of times, reducing the total computational time is crucial. Existing publications have explored various machine-learning approaches to determine the most effective data-driven surrogate model. In particular, multilayer perceptrons are widely employed due to their ability to capture nonlinear relationships. In this work, we focus on the recent Kolmogorov-Arnold networks, where learnable spline-based functions replace classical fixed activation functions. This architecture has achieved higher accuracy with fewer trainable parameters and has become increasingly popular for solving partial differential equations. First, we train a surrogate model based on an existing cement system benchmark. Then, we move to an application case for the geological disposal of nuclear waste, namely determining the solubilities of radionuclide-bearing solids. To the best of our knowledge, this work is the first to investigate co-precipitation with radionuclide incorporation using data-driven surrogate models, considering increasing levels of thermodynamic complexity from simple mechanical mixtures to non-ideal solid solutions of binary (Ba,Ra)SO$_4$ and ternary (Sr,Ba,Ra)SO$_4$ systems. On the cement benchmark, we demonstrate that the Kolmogorov-Arnold architecture outperforms multilayer perceptrons in both absolute and relative error metrics, reducing them by 62% and 59%, respectively. On the binary and ternary radium solid solution models, Kolmogorov-Arnold networks maintain median prediction errors near $1\times10^{-3}$. This is the first step toward employing surrogate models to speed up reactive transport simulations and optimize the safety assessment of deep geological waste repositories.
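
To make the architectural idea concrete, the following is a minimal NumPy sketch of a single Kolmogorov-Arnold layer, in which every edge carries its own learnable univariate function and each output simply sums its incoming edges. It is an illustration under simplifying assumptions, not the model used in the paper: the edge functions are piecewise-linear (degree-1 B-splines) on a fixed uniform grid, no training loop is included, and the names `hat_basis` and `KANLayerSketch` are hypothetical.

```python
import numpy as np


def hat_basis(x, grid):
    """Degree-1 B-spline (hat) basis on a uniform grid.

    x: (batch, n_in), grid: (n_knots,) -> (batch, n_in, n_knots).
    Inputs outside the grid range get an all-zero basis.
    """
    h = grid[1] - grid[0]
    return np.clip(1.0 - np.abs(x[..., None] - grid) / h, 0.0, None)


class KANLayerSketch:
    """One KAN-style layer: a learnable 1D function phi_ij on each edge i -> j."""

    def __init__(self, n_in, n_out, n_knots=8, x_min=-1.0, x_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(x_min, x_max, n_knots)
        # Trainable parameters: one coefficient vector per edge.
        self.coef = rng.normal(scale=0.1, size=(n_in, n_out, n_knots))

    def __call__(self, x):
        basis = hat_basis(x, self.grid)  # (batch, n_in, n_knots)
        # phi_ij(x_i) = sum_k coef[i, j, k] * basis_k(x_i)
        # output_j    = sum_i phi_ij(x_i)
        return np.einsum("bik,iok->bo", basis, self.coef)


layer = KANLayerSketch(n_in=3, n_out=2)
x = np.random.default_rng(1).uniform(-1.0, 1.0, size=(4, 3))
y = layer(x)  # shape (4, 2)
```

In this sketch the number of trainable parameters is `n_in * n_out * n_knots`, whereas a dense MLP layer of the same shape has `n_in * n_out + n_out` weights and biases followed by a fixed activation.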

Paper Structure (14 sections, 9 equations, 7 figures, 5 tables)


Figures (7)

  • Figure 1: Sketch of an MLP (left) and a KAN (right) with two hidden layers. In KANs, instead of training for weights and biases, learnable activation functions are placed on each edge.
  • Figure 2: RMSE on the test set for the cement hydration case. The reference MLP model (blue) is shown against two KANs of different sizes (green and orange).
  • Figure 3: RRMSE on the test set for the cement hydration case. The reference MLP model (blue) is shown against two KANs of different sizes (green and orange).
  • Figure 4: Relative error on the test set for the radium uptake case with mechanical mixing. The errors are plotted for three KANs trained on datasets of different sizes, where $m$ denotes the exponent of the Sobol sampler. In red, we show the number of predictions with an error above $10\%$.
  • Figure 5: Relative error on the test set for the radium uptake case with mechanical mixing. The classical MLP model (blue) is shown against two KANs of different sizes (green and orange). In red, we show the number of predictions with an error above $10\%$.
  • ...and 2 more figures
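
The error measures named in the captions above can be sketched as follows. The excerpt does not give the formulas, so the definitions here are assumptions: RMSE is taken per output over the test set, RRMSE is taken as the root mean square of the pointwise relative error, and the count corresponds to the number of predictions whose relative error exceeds $10\%$. The function names and the `eps` guard against division by zero are illustrative.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error per output column."""
    return np.sqrt(np.mean((y_pred - y_true) ** 2, axis=0))


def relative_error(y_true, y_pred, eps=1e-12):
    """Pointwise relative error; eps avoids division by zero (assumption)."""
    return np.abs(y_pred - y_true) / (np.abs(y_true) + eps)


def rrmse(y_true, y_pred, eps=1e-12):
    """Assumed definition: root mean square of the relative error per column."""
    return np.sqrt(np.mean(relative_error(y_true, y_pred, eps) ** 2, axis=0))


def count_above(y_true, y_pred, threshold=0.10, eps=1e-12):
    """Number of predictions whose relative error exceeds the threshold."""
    return int(np.sum(relative_error(y_true, y_pred, eps) > threshold))


# Toy values for illustration only, not data from the paper.
y_true = np.array([[1.0], [2.0], [4.0]])
y_pred = np.array([[1.05], [2.0], [3.0]])
print(rmse(y_true, y_pred), rrmse(y_true, y_pred), count_above(y_true, y_pred))
```

Other normalizations of the relative RMSE exist, for example dividing the RMSE by the mean of the reference values, so the definition should be checked against the full text before comparing numbers.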