Practical multi-fidelity machine learning: fusion of deterministic and Bayesian models

Jiaxiang Yi; Ji Cheng; Miguel A. Bessa


TL;DR

The paper addresses accurate high-fidelity (HF) prediction when HF data are limited. It proposes a practical multi-fidelity Bayesian machine learning (MF-BML) framework that combines a deterministic low-fidelity (LF) surrogate with a Bayesian model of the HF residual through a simple transfer-learning step. The HF predictor is written as $f^h(\mathbf{x})=g(f^l(\mathbf{x}),\mathbf{x})+r(\mathbf{x})$, where $f^l$ is the LF surrogate, $g$ is a linear transfer-learning model acting on the LF outputs, and $r$ is a residual modeled probabilistically, which provides uncertainty quantification. Two concrete configurations are developed: (i) KRR-LR-GPR for data-scarce problems, where $f^l$ is kernel ridge regression (KRR), $g$ is linear regression (LR), and the HF residual is Gaussian process regression (GPR); and (ii) DNN-LR-BNN for data-rich settings, where a deep neural network (DNN) LF model is combined with the same linear transfer-learning step and a Bayesian neural network (BNN) residual. In both cases the transfer-learning coefficients $\boldsymbol{\rho}$ are learned from HF data rather than treated as hyperparameters. The framework achieves mean and uncertainty estimates comparable to state-of-the-art MF methods, with reduced training time and better scalability in data-scarce regimes, and it shows improved performance on high-dimensional, multi-output problems. The code is publicly available. The approach offers a parsimonious yet flexible path to practical multi-fidelity modeling in engineering and science.

Abstract

Multi-fidelity machine learning methods address the accuracy-efficiency trade-off by integrating scarce, resource-intensive high-fidelity data with abundant but less accurate low-fidelity data. We propose a practical multi-fidelity strategy for problems spanning low- and high-dimensional domains, integrating a non-probabilistic regression model for the low-fidelity with a Bayesian model for the high-fidelity. The models are trained in a staggered scheme, where the low-fidelity model is transfer-learned to the high-fidelity data and a Bayesian model is trained to learn the residual between the data and the transfer-learned model. This three-model strategy -- deterministic low-fidelity, transfer-learning, and Bayesian residual -- leads to a prediction that includes uncertainty quantification for noisy and noiseless multi-fidelity data. The strategy is general and unifies the topic, highlighting the expressivity trade-off between the transfer-learning and Bayesian models (a complex transfer-learning model leads to a simpler Bayesian model, and vice versa). We propose modeling choices for two scenarios, and argue in favor of using a linear transfer-learning model that fuses 1) kernel ridge regression for low-fidelity with Gaussian processes for high-fidelity; or 2) deep neural network for low-fidelity with a Bayesian neural network for high-fidelity. We demonstrate the effectiveness and efficiency of the proposed strategies and contrast them with the state-of-the-art based on various numerical examples and two engineering problems. The results indicate that the proposed approach achieves comparable performance in both mean and uncertainty estimation while significantly reducing training time for machine learning modeling in data-scarce scenarios. Moreover, in data-rich settings, it outperforms other multi-fidelity architectures by effectively mitigating overfitting.
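The staggered scheme described above can be illustrated with a minimal NumPy sketch in the spirit of the KRR-LR-GPR configuration. This is not the authors' released code: the one-dimensional Forrester-style toy functions, the fixed kernel length scales, the ridge and jitter values, and the affine form of the transfer model (intercept plus one slope) are all assumptions chosen for illustration, and the GP hyperparameters are fixed rather than optimized.

```python
import numpy as np

def rbf(a, b, length):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

# Toy fidelities (illustrative assumption, not taken from the paper).
def f_high(x):
    return (6 * x - 2) ** 2 * np.sin(12 * x - 4)

def f_low(x):
    return 0.5 * f_high(x) + 10 * (x - 0.5) - 5

x_lf = np.linspace(0.0, 1.0, 50)   # abundant LF samples
y_lf = f_low(x_lf)
x_hf = np.linspace(0.0, 1.0, 6)    # scarce HF samples
y_hf = f_high(x_hf)

# Step 1: deterministic LF surrogate f^l via kernel ridge regression.
LF_LENGTH, LF_RIDGE = 0.1, 1e-4    # assumed, not tuned
alpha = np.linalg.solve(
    rbf(x_lf, x_lf, LF_LENGTH) + LF_RIDGE * np.eye(x_lf.size), y_lf
)

def predict_lf(x):
    return rbf(x, x_lf, LF_LENGTH) @ alpha

# Step 2: linear transfer-learning g; rho is fitted to HF data by least squares.
def transfer_features(x):
    return np.column_stack([np.ones_like(x), predict_lf(x)])

rho, *_ = np.linalg.lstsq(transfer_features(x_hf), y_hf, rcond=None)

# Step 3: zero-mean GP on the residual r = y_hf - g(f^l(x_hf)).
residual = y_hf - transfer_features(x_hf) @ rho
GP_LENGTH, JITTER = 0.2, 1e-8      # assumed, not optimized
amp = residual.var() + 1e-12       # crude signal-variance estimate
K = rbf(x_hf, x_hf, GP_LENGTH) + JITTER * np.eye(x_hf.size)

def predict_hf(x):
    """Return the HF predictive mean and standard deviation at inputs x."""
    k_star = rbf(x, x_hf, GP_LENGTH)
    mean = transfer_features(x) @ rho + k_star @ np.linalg.solve(K, residual)
    var = amp * (1.0 - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1))
    return mean, np.sqrt(np.clip(var, 0.0, None))

mean, std = predict_hf(np.linspace(0.0, 1.0, 21))
```

Only the residual model is probabilistic in this sketch, so the predictive standard deviation comes entirely from the GP on $r(\mathbf{x})$; the LF surrogate and the fitted $\boldsymbol{\rho}$ contribute to the mean only.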

Paper Structure (5 sections, 5 equations, 1 figure, 1 table)


Introduction
Methodology and related work
Related work for data-scarce scenarios
Related work for data-rich scenarios
Practical multi-fidelity Bayesian machine learning

Figures (1)

Figure 1: Schematic (a) shows a linear regression model with basis functions $\mathbf{h}(\mathbf{x}) = [1, \mathbf{x}, \mathbf{x}^2, \ldots]^T$ and coefficients $\boldsymbol{\beta}$, augmented by a residual $\delta(\mathbf{x})$ modeled by a zero-mean GPR. Schematic (b) shows the proposed MF-BML strategy, where the transfer-learning model $g\left(f^l(\mathbf{x}) \right)$ is a linear regression model whose features are the outputs of the LF surrogate model $f^l(\mathbf{x})$ obtained by training on LF data. The linear transfer-learning function acting on $f^l(\mathbf{x})$ adjusts the LF model to the HF data by determining the coefficients $\boldsymbol{\rho}$, and it facilitates the determination of the residual $r(\mathbf{x})$ by a Bayesian model with a simple prior.
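Written out, the two schematics in the caption correspond to the decompositions below. This is a hedged reading of the caption rather than the paper's numbered equations: the output symbol $f(\mathbf{x})$ and kernel $k$ in (a), and the explicit intercept $\rho_0$ in (b), are assumptions made here for illustration.

$$
\text{(a)}\quad f(\mathbf{x}) = \mathbf{h}(\mathbf{x})^T \boldsymbol{\beta} + \delta(\mathbf{x}), \qquad \delta(\mathbf{x}) \sim \mathcal{GP}\left(0, k(\mathbf{x}, \mathbf{x}')\right)
$$

$$
\text{(b)}\quad f^h(\mathbf{x}) = g\left(f^l(\mathbf{x})\right) + r(\mathbf{x}), \qquad g\left(f^l(\mathbf{x})\right) = \rho_0 + \rho_1 f^l(\mathbf{x}), \qquad \boldsymbol{\rho} = [\rho_0, \rho_1]^T
$$

In (b) the trained LF surrogate $f^l(\mathbf{x})$ takes the place of the fixed polynomial basis $\mathbf{h}(\mathbf{x})$ used in (a), and the Bayesian residual $r(\mathbf{x})$ takes the place of $\delta(\mathbf{x})$.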
