Practical multi-fidelity machine learning: fusion of deterministic and Bayesian models
Jiaxiang Yi, Ji Cheng, Miguel A. Bessa
TL;DR
The paper addresses efficient high-fidelity (HF) prediction when HF data are limited by proposing a practical multi-fidelity (MF) framework, MF-BML, that combines a deterministic low-fidelity (LF) surrogate with a Bayesian HF residual through a simple transfer-learning step. The HF predictor is written as $f^h(\mathbf{x})=g(f^l(\mathbf{x}),\mathbf{x})+r(\mathbf{x})$, where $f^l$ is the LF model, $g$ is a linear transfer-learning model acting on the LF outputs, and $r$ is a residual modeled probabilistically, which provides uncertainty quantification. Two concrete configurations are developed: (i) KRR-LR-GPR for data-scarce problems, where $f^l$ is a kernel ridge regression model and the HF residual is a Gaussian process regression model, and (ii) DNN-LR-BNN for data-rich settings, where a deep neural network LF model is combined with a linear transfer-learning step and a Bayesian neural network residual. In both cases the transfer-learning coefficients $\boldsymbol{\rho}$ are learned from HF data rather than treated as hyperparameters. The framework achieves mean and uncertainty estimates comparable to state-of-the-art MF methods with reduced training time and better scalability in data-scarce regimes, and it shows improved performance on high-dimensional, multi-output problems. The code is publicly available. The approach offers a parsimonious yet flexible path to practical multi-fidelity modeling in engineering and science.
Abstract
Multi-fidelity machine learning methods address the accuracy-efficiency trade-off by integrating scarce, resource-intensive high-fidelity data with abundant but less accurate low-fidelity data. We propose a practical multi-fidelity strategy for problems spanning low- and high-dimensional domains, integrating a non-probabilistic regression model for the low-fidelity data with a Bayesian model for the high-fidelity data. The models are trained in a staggered scheme, where the low-fidelity model is transfer-learned to the high-fidelity data and a Bayesian model is trained to learn the residual between the data and the transfer-learned model. This three-model strategy -- deterministic low-fidelity, transfer-learning, and Bayesian residual -- leads to a prediction that includes uncertainty quantification for noisy and noiseless multi-fidelity data. The strategy is general and unifies the topic, highlighting the expressivity trade-off between the transfer-learning and Bayesian models (a complex transfer-learning model leads to a simpler Bayesian model, and vice versa). We propose modeling choices for two scenarios, and argue in favor of using a linear transfer-learning model that fuses 1) kernel ridge regression for low-fidelity with Gaussian processes for high-fidelity; or 2) a deep neural network for low-fidelity with a Bayesian neural network for high-fidelity. We demonstrate the effectiveness and efficiency of the proposed strategies and contrast them with the state-of-the-art on various numerical examples and two engineering problems. The results indicate that the proposed approach achieves comparable performance in both mean and uncertainty estimation while significantly reducing training time for machine learning modeling in data-scarce scenarios. Moreover, in data-rich settings, it outperforms other multi-fidelity architectures by effectively mitigating overfitting.
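To make the staggered three-model scheme concrete, the following is a minimal NumPy sketch of the data-scarce configuration (kernel ridge regression, linear transfer-learning, Gaussian process residual). It is an illustration under stated assumptions, not the authors' implementation: the toy function pair `f_high`/`f_low`, the sample locations, the fixed kernel length scale, the regularization and noise values, and the specific transfer form $\rho_0 + \rho_1 f^l(\mathbf{x})$ are all hypothetical choices made here for brevity. In particular, hyperparameters are fixed by hand rather than optimized.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return np.exp(-0.5 * (d / length_scale) ** 2)


# Hypothetical toy problem (not taken from the paper).
def f_high(x):
    return (6 * x - 2) ** 2 * np.sin(12 * x - 4)


def f_low(x):
    return 0.5 * f_high(x) + 10 * (x - 0.5) - 5


x_lf = np.linspace(0.0, 1.0, 50)        # abundant low-fidelity samples
y_lf = f_low(x_lf)
x_hf = np.array([0.0, 0.4, 0.6, 1.0])   # scarce high-fidelity samples
y_hf = f_high(x_hf)

# Step 1: deterministic low-fidelity model (kernel ridge regression).
lam = 1e-6  # assumed ridge regularization
alpha = np.linalg.solve(rbf_kernel(x_lf, x_lf) + lam * np.eye(x_lf.size), y_lf)


def lf_model(x):
    return rbf_kernel(x, x_lf) @ alpha


# Step 2: linear transfer-learning, rho fitted to HF data by least squares.
# Assumed form: g(f_l(x)) = rho[0] + rho[1] * f_l(x).
basis_hf = np.column_stack([np.ones_like(x_hf), lf_model(x_hf)])
rho, *_ = np.linalg.lstsq(basis_hf, y_hf, rcond=None)

# Step 3: zero-mean Gaussian process on the residual (fixed hyperparameters).
resid = y_hf - basis_hf @ rho
noise = 1e-8                      # assumed jitter for near-noiseless data
sig2 = max(resid.var(), 1e-12)    # signal variance set from the residuals
K = sig2 * rbf_kernel(x_hf, x_hf) + noise * np.eye(x_hf.size)
K_inv_resid = np.linalg.solve(K, resid)


def predict(x):
    """Return the predictive mean and standard deviation at inputs x."""
    basis = np.column_stack([np.ones_like(x), lf_model(x)])
    k_star = sig2 * rbf_kernel(x, x_hf)
    mean = basis @ rho + k_star @ K_inv_resid
    var = sig2 - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


mean_hf, std_hf = predict(x_hf)
mean_mid, std_mid = predict(np.array([0.2]))
```

In this sketch the predictive uncertainty comes only from the residual model, which mirrors the division of labor described in the abstract: the low-fidelity and transfer-learning models are deterministic, and the Bayesian model accounts for what they fail to explain. A more expressive transfer model would leave a simpler residual for the Gaussian process, and vice versa.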
