Assessing the performance of correlation-based multi-fidelity neural emulators
Cristian J. Villatoro, Gianluca Geraci, Daniele E. Schiavazzi
TL;DR
This study systematically evaluates three neural architectures (MLP, Siren, KAN) for multi-fidelity emulation, integrating diverse low-fidelity sources with limited high-fidelity data. It introduces coordinate encoding to reconcile differing input spaces and compares standard versus encoded MF networks across 1D/2D, PDE, and high-dimensional problems, including discontinuities and phase shifts. Key findings show that MF frameworks typically outperform HF-only surrogates, with KAN delivering the strongest performance on complex and high-dimensional tasks, while coordinate encoding is most beneficial when LF–HF relationships exhibit misalignment or discontinuities. The results offer guidance on architecture choice and encoding strategies for efficient, accurate HF surrogates in computationally expensive scientific applications.
Abstract
Outer-loop tasks such as optimization, uncertainty quantification, or inference can easily become intractable when the underlying high-fidelity model is computationally expensive. Similarly, data-driven architectures typically require large datasets to perform predictive tasks with sufficient accuracy. One way to mitigate these challenges is to develop multi-fidelity emulators, which leverage potentially biased, inexpensive low-fidelity information while correcting and refining predictions using scarce, accurate high-fidelity data. This study investigates the performance of multi-fidelity neural emulators, neural networks designed to learn the input-to-output mapping by integrating limited high-fidelity data with abundant low-fidelity model solutions. We assess such emulators on low- and high-dimensional functions, on functions with oscillatory character or discontinuities, on collections of models with equal and dissimilar parametrization, and with a possibly large number of potentially corrupted low-fidelity sources. In doing so, we consider a large number of architectural, hyperparameter, and dataset configurations, including networks with differing degrees of spectral bias (Multi-Layer Perceptron, Siren, and Kolmogorov-Arnold Network), various mechanisms for coordinate encoding, exact or learnable low-fidelity information, and varying training dataset sizes. We further analyze the added value of the multi-fidelity approach by conducting equivalent single-fidelity tests for each case, quantifying the performance gains achieved through fusing multiple sources of information.
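As a rough illustration of the correlation-based idea, the sketch below fits a high-fidelity surrogate from six samples, once using the input alone and once using the input together with the low-fidelity output as an additional feature. It is a minimal sketch under stated assumptions, not the networks studied here: a linear least-squares model stands in for the MLP, Siren, and KAN architectures, and `f_low`, `f_high`, and the feature maps are hypothetical choices made only for this example. The low-fidelity model is assumed to be available exactly and cheaply at any input.

```python
import numpy as np

def f_low(x):
    # Hypothetical inexpensive low-fidelity model (assumption for illustration).
    return np.sin(8.0 * x)

def f_high(x):
    # Hypothetical expensive high-fidelity model, correlated with f_low.
    return 1.5 * np.sin(8.0 * x) + 0.5 * x

def features_sf(x):
    # Single-fidelity features: the surrogate sees only the input x.
    return np.column_stack([np.ones_like(x), x, x**2])

def features_mf(x):
    # Multi-fidelity features: the surrogate also sees the low-fidelity output.
    return np.column_stack([np.ones_like(x), x, f_low(x)])

# Scarce high-fidelity training data.
x_train = np.linspace(0.0, 1.0, 6)
y_train = f_high(x_train)

# Least-squares fits (a stand-in for training a neural emulator).
w_sf, *_ = np.linalg.lstsq(features_sf(x_train), y_train, rcond=None)
w_mf, *_ = np.linalg.lstsq(features_mf(x_train), y_train, rcond=None)

# Compare both surrogates on a dense test grid.
x_test = np.linspace(0.0, 1.0, 200)
y_test = f_high(x_test)
rmse_sf = np.sqrt(np.mean((features_sf(x_test) @ w_sf - y_test) ** 2))
rmse_mf = np.sqrt(np.mean((features_mf(x_test) @ w_mf - y_test) ** 2))

print(f"single-fidelity RMSE: {rmse_sf:.3e}")
print(f"multi-fidelity RMSE:  {rmse_mf:.3e}")
```

In this toy setting the high-fidelity response is an exact linear function of the input and the low-fidelity output, so the multi-fidelity fit recovers it from very few samples while the single-fidelity quadratic cannot follow the oscillation. Realistic cases involve nonlinear, misaligned, or discontinuous low-to-high-fidelity relationships, where no such exact recovery should be expected.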
