Efficient Learning of Accurate Surrogates for Simulations of Complex Systems

A. Diaw; M. McKerns; I. Sagert; L. G. Stanton; M. S. Murillo

TL;DR

The paper tackles the challenge of building fast, accurate surrogates for expensive simulations when data are noisy, sparse, or time-dependent. It introduces an online learning framework that couples optimizer-directed sampling with radial-basis-function surrogates (thin-plate RBF) and automatic retraining when a validity score falls below a threshold, aiming for asymptotic validity on future data. Key contributions include formal definitions of asymptotic and training validity, an ensemble-based optimizer strategy to locate critical points, and demonstrations on benchmark functions and a nuclear-matter equation of state with a phase transition, achieving high accuracy from relatively few evaluations. The approach, with open-source code and data, offers a data-efficient workflow for robust surrogate generation applicable to complex, time-dependent physics problems.
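The test-and-retrain step described above can be sketched as follows. This is a minimal illustration, not the authors' code: SciPy's `RBFInterpolator` stands in for the thin-plate RBF estimator, and `model`, `score`, `update`, and the tolerance `tol` are hypothetical names chosen for this example. The score here is a mean absolute misfit (lower is better), so the surrogate counts as invalid when the misfit exceeds `tol`.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator


def model(x):
    """Stand-in for an expensive simulation (hypothetical smooth 2-D response)."""
    x = np.atleast_2d(x)
    return np.sin(x[:, 0]) * np.cos(x[:, 1])


def fit_surrogate(X, y):
    """Fit a thin-plate-spline RBF interpolant to the stored evaluations."""
    return RBFInterpolator(X, y, kernel="thin_plate_spline")


def score(surrogate, X, y):
    """Mean absolute misfit between surrogate and model data (lower is better)."""
    return float(np.mean(np.abs(surrogate(X) - y)))


def update(surrogate, X_db, y_db, X_new, y_new, tol=1e-3):
    """Store new evaluations; retrain only if the surrogate fails the test."""
    X_db = np.vstack([X_db, X_new])
    y_db = np.concatenate([y_db, y_new])
    retrained = score(surrogate, X_new, y_new) > tol
    if retrained:
        surrogate = fit_surrogate(X_db, y_db)
    return surrogate, X_db, y_db, retrained


rng = np.random.default_rng(0)
X_db = rng.uniform(-3, 3, size=(20, 2))   # initial stored evaluations
y_db = model(X_db)
surrogate = fit_surrogate(X_db, y_db)

X_new = rng.uniform(-3, 3, size=(10, 2))  # new model evaluations arrive
y_new = model(X_new)
surrogate, X_db, y_db, retrained = update(surrogate, X_db, y_db, X_new, y_new)
```

Retraining only on a failed test keeps the cost of handling new data low while the surrogate remains valid.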

Abstract

Machine learning methods are increasingly used to build computationally inexpensive surrogates for complex physical models. The predictive capability of these surrogates suffers when data are noisy, sparse, or time-dependent. As we are interested in finding a surrogate that provides valid predictions of any potential future model evaluations, we introduce an online learning method empowered by optimizer-driven sampling. The method has two advantages over current approaches. First, it ensures that all turning points on the model response surface are included in the training data. Second, after any new model evaluations, surrogates are tested and "retrained" (updated) if the "score" drops below a validity threshold. Tests on benchmark functions reveal that optimizer-directed sampling generally outperforms traditional sampling methods in terms of accuracy around local extrema, even when the scoring metric favors overall accuracy. We apply our method to simulations of nuclear matter to demonstrate that highly accurate surrogates for the nuclear equation of state can be reliably auto-generated from expensive calculations using a few model evaluations.
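One way to picture optimizer-driven sampling is to run an ensemble of local optimizers from random starting points and keep every model evaluation they make as training data. The sketch below is an assumption-laden illustration, not the paper's implementation: it uses SciPy's Nelder-Mead solver (with bounds, which requires SciPy 1.7 or later), searches for both minima and maxima by flipping the sign of the objective, and the names `Recorder` and `optimizer_directed_samples` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def rastrigin(x):
    """Rastrigin benchmark function (standard form, global minimum 0 at the origin)."""
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + float(np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x)))


class Recorder:
    """Wrap a model so that every evaluation is stored as training data."""

    def __init__(self, func):
        self.func = func
        self.X, self.y = [], []

    def __call__(self, x):
        value = self.func(x)
        self.X.append(np.array(x, dtype=float))
        self.y.append(value)
        return value


def optimizer_directed_samples(func, bounds, n_optimizers=16, maxiter=50, seed=0):
    """Collect the points visited by an ensemble of bounded local optimizers."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    recorded = Recorder(func)
    for sign in (1.0, -1.0):  # sign=+1 seeks minima, sign=-1 seeks maxima
        for _ in range(n_optimizers):
            x0 = rng.uniform(lo, hi)
            minimize(lambda x: sign * recorded(x), x0, method="Nelder-Mead",
                     bounds=bounds, options={"maxiter": maxiter})
    return np.array(recorded.X), np.array(recorded.y)


bounds = [(-5.12, 5.12)] * 2
X, y = optimizer_directed_samples(rastrigin, bounds)
```

Because each optimizer converges toward an extremum, the recorded points cluster around turning points of the response surface, which is where a surrogate trained on purely random samples tends to be least accurate.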

Paper Structure

This paper contains 2 sections, 16 equations, 6 figures, and 1 table.

Sampling for Asymptotic Validity.
Sampling for Training Validity.

Figures (6)

Figure 1: Schematic for Automated Generation of Inexpensive Surrogates for Complex Systems. When new model evaluations occur, the corresponding surrogate is retrieved and evaluated for the same data. If the surrogate is determined to still be valid, the execution stops. Otherwise, the surrogate is updated by retraining against the DB of stored model evaluations, where the surrogate is validated with a fine-tuning of surrogate hyperparameters against a quality metric. If iterative retraining improves the surrogate, it is saved. Otherwise, new model evaluations are sampled to generate additional data. The process repeats until testing produces a valid surrogate.
Figure 2: Candidate surrogates for the 2-dimensional Rastrigin function, learned with a thin-plate RBF estimator using "sparsity" sampling, a "strict" tolerance, and a test metric for validity based on the average graphical distance between the learned surrogate and sampled data. Surrogates are plotted with inputs $x = (x_0, x_1)$ and output $z = f(x)$. Top row: Sampling using ensembles of $16$ optimizers after the initial and tenth iterations. Bottom row: the final iteration and the test score per sample. The final surrogate is visually identical to ground truth and reproduces all local extrema within the desired accuracy. The test score for pure systematic random sampling converges faster than optimizer-directed sampling, as may be expected for a metric based on the average surrogate misfit.
Figure 3: Convergence of the test score versus model evaluations for different benchmark functions learned with a thin-plate RBF estimator using "sparsity" sampling, and a test metric for validity based on the average graphical distance between the learned surrogate and sampled data. Top left: 2-dimensional Rastrigin function with "loose" tolerance. Top right: 2-dimensional Easom function with "strict" tolerance. Bottom left: 2-dimensional Rosenbrock function with "strict" tolerance. Bottom right: 2-dimensional Michalewicz function with "loose" tolerance. Note that the surrogates on the top row reproduce ground truth well regardless of approach, as in Figure 2. For the 2-D Rosenbrock, both approaches perform as they do for the 8-D Rosenbrock in Figure 5. Both random and optimizer-directed approaches do equally poorly with the 2-D Michalewicz.
Figure 4: Candidate surrogates for the 2-dimensional Rosenbrock function, learned with a thin-plate RBF estimator using "sparsity" sampling, a "loose" tolerance, and a test metric for validity based on the average graphical distance between the learned surrogate and sampled data. Surrogates are plotted with inputs $x = (x_0, x_1)$ and output $z = Z(f(x))$, where log-scaling $Z = \log(4 \cdot f(x) + 1) + 2$ is used to view the region around the global minimum better. Top row, left: model evaluations sampled with the random sampling strategy. Top row, right: optimizer-directed sampling. Bottom row, left: log-scaled view of surrogate from random sampling. Bottom row, right: log-scaled view of surrogate from optimizer-directed sampling, which reproduces ground truth. Note that convergence occurs quickly using either strategy: the converged condition is met within two iterations.
Figure 5: Candidate surrogates for the 8-dimensional Rosenbrock function, learned with a thin-plate RBF estimator using "sparsity" sampling, a "loose" tolerance, and a test metric for validity based on the average graphical distance between the learned surrogate and sampled data. Surrogates are plotted with inputs $x = (x_0, x_1, 1, 1, 1, 1, 1, 1)$ and output $z = f(x)$ or $z = Z(f(x))$, where log-scaling $Z = \log(4 \cdot f(x) + 1) + 2$ is used to view the region around the global minimum better. Top row, left: test score per sample. Top row, center: model evaluations sampled with the random sampling strategy. Top row, right: optimizer-directed sampling. Bottom row, left: surrogates produced with either sampling approach are visually identical to the ground truth. Bottom row, center: log-scaled view of surrogate from random sampling near the global minimum. Bottom row, right: log-scaled view of surrogate from optimizer-directed sampling near the global minimum, identical to the ground truth. While pure systematic random sampling converges faster, optimizer-directed sampling provides a more accurate surrogate near the critical points.
...and 1 more figure
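The log-scaled view in the captions of Figures 4 and 5 can be reproduced with a few lines. This is a sketch under stated assumptions: the captions do not define the Rosenbrock function, so the standard generalized form is assumed, "log" is taken to be the natural logarithm, and `slice_8d` is a hypothetical helper for the 8-dimensional slice $x = (x_0, x_1, 1, 1, 1, 1, 1, 1)$.

```python
import numpy as np


def rosenbrock(x):
    """Generalized Rosenbrock function (assumed standard form), minimum 0 at x = 1."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2))


def log_scaled(f_value):
    """Log-scaling from the captions: Z = log(4 * f + 1) + 2 (natural log assumed)."""
    return np.log(4.0 * f_value + 1.0) + 2.0


def slice_8d(x0, x1):
    """8-D input that varies the first two coordinates and fixes the rest at 1."""
    return np.array([x0, x1, 1, 1, 1, 1, 1, 1], dtype=float)


z_min = log_scaled(rosenbrock(slice_8d(1.0, 1.0)))  # value at the global minimum
z_origin = log_scaled(rosenbrock([0.0, 0.0]))        # 2-D value at the origin
```

Since $f = 0$ at the global minimum, $Z$ equals 2 there, and the logarithm compresses the steep valley walls so that structure near the minimum remains visible in a plot.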
