Stochastic hierarchical data-driven optimization: application to plasma-surface kinetics
José Afonso, Vasco Guerra, Pedro Viegas
TL;DR
The paper addresses the calibration of high-dimensional, sloppy physical models when simulations are expensive and data are scarce. It introduces a stochastic hierarchical optimization method that identifies stiff parameter directions via a reduced Gauss-Newton Hessian and navigates the loss landscape with a heuristic meta-simulator, using a probabilistic loss derived from observed data. Applied to a plasma-surface kinetics model of atomic oxygen recombination on Pyrex, the method converges rapidly and sample-efficiently (validated on 225 conditions, with test R^2 ≈ 0.736) and yields uncertainty estimates that highlight the most influential parameters. The approach is a transferable, scalable tool for inverse problems in complex reaction systems, and it provides a principled way to quantify parameter stiffness and predictive uncertainty in data-sparse regimes.
Abstract
This work introduces a stochastic hierarchical optimization framework inspired by Sloppy Model theory for the efficient calibration of physical models. Central to this method is a reduced Hessian approximation, which identifies and targets the stiff parameter subspace using minimal simulation queries. This strategy enables efficient navigation of highly anisotropic loss landscapes and avoids the computational burden of exhaustive sampling. To ensure rigorous inference, we integrate this approach with a probabilistic formulation that derives a principled loss function directly from observed data. We validate the framework by applying it to the problem of plasma-surface interactions, where accurate modelling is severely limited by uncertainties in surface reactivity parameters and by the computational cost of kinetic simulations. Comparative analysis demonstrates that our method consistently outperforms baseline optimization techniques in sample efficiency. This approach offers a general and scalable tool for optimizing models of complex reaction systems, ranging from plasma chemistry to biochemical networks.
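
To make the notion of a stiff parameter subspace concrete, the sketch below builds a Gauss-Newton Hessian H = J^T J for a toy problem and keeps the eigenvectors whose eigenvalues are large relative to the largest one. It is an illustration only and not the reduced approximation used in this work. The toy model (a sum of decaying exponentials), the names `simulate`, `fd_jacobian` and `stiff_subspace`, the noise level `sigma`, and the threshold `rel_tol` are all assumptions made for the example. The Jacobian is formed by plain forward differences, which costs one extra simulation per parameter.

```python
import numpy as np

def simulate(log_k, t):
    """Hypothetical toy model: sum of decaying exponentials with rates exp(log_k)."""
    return np.exp(-np.outer(t, np.exp(log_k))).sum(axis=1)

def residuals(log_k, t, y_obs, sigma):
    """Residuals scaled by an assumed Gaussian noise level sigma."""
    return (simulate(log_k, t) - y_obs) / sigma

def fd_jacobian(f, x, eps=1e-6):
    """Forward-difference Jacobian: one extra call to f per parameter."""
    f0 = f(x)
    J = np.empty((f0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (f(x + step) - f0) / eps
    return J

def stiff_subspace(J, rel_tol=1e-3):
    """Eigen-decompose H = J^T J and keep directions with large eigenvalues."""
    H = J.T @ J
    eigvals, eigvecs = np.linalg.eigh(H)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > rel_tol * eigvals[0]     # assumed stiffness threshold
    return eigvals, eigvecs[:, keep]

# Synthetic, noise-free data generated from assumed "true" rates.
t = np.linspace(0.1, 5.0, 40)
true_log_k = np.log(np.array([0.5, 1.0, 2.0, 4.0]))
y_obs = simulate(true_log_k, t)
sigma = 0.01

# Evaluate the spectrum at a perturbed parameter guess.
theta = true_log_k + 0.1
J = fd_jacobian(lambda p: residuals(p, t, y_obs, sigma), theta)
eigvals, V_stiff = stiff_subspace(J)

print("eigenvalues (descending):", eigvals)
print("number of stiff directions kept:", V_stiff.shape[1])
```

In sloppy models the eigenvalues of H typically span many orders of magnitude, so a relative threshold such as `rel_tol` retains only a few directions. An optimizer can then restrict its search to the columns of `V_stiff` instead of exploring all parameters equally.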
