Residual Multi-Fidelity Neural Network Computing

Owen Davis, Mohammad Motamed, Raul Tempone

TL;DR

This paper introduces the Residual Multi-Fidelity Neural Network (RMFNN), a theoretically grounded framework for constructing fast high-fidelity surrogates by learning a nonlinear residual $F$ between a low-fidelity model $Q_{LF}$ and a high-fidelity model $Q_{HF}$. The approach trains two networks in tandem: ResNN learns the residual $F(\boldsymbol{\theta}, Q_{LF}(\boldsymbol{\theta}))$ and is then used to generate synthetic high-fidelity data, while DNN learns the high-fidelity quantity $Q_{HF}$ from both the original and the synthetic data. A key theoretical result from Davis and Motamed (2024) guarantees the existence of a ReLU network that approximates a broad class of target functions with network complexity tied to the uniform norm of the target, which justifies a low-complexity network when the residual is small. Numerical experiments show substantial computational savings, particularly when $||F||_{L^{\infty}}$ is small, and illustrate the framework's advantages over direct high-fidelity learning and discrepancy-based alternatives, with clear paths toward multi-fidelity extensions and uncertainty quantification tasks.
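
To make the role of $||F||_{L^{\infty}}$ concrete, the following minimal sketch compares the sampled sup-norm of a residual with that of the high-fidelity quantity. The two toy models, the parameter range $[0, 2\pi]$, and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def q_lf(theta):
    # Hypothetical cheap model: captures the dominant behavior only.
    return math.sin(theta)

def q_hf(theta):
    # Hypothetical expensive model: adds a small oscillatory correction.
    return math.sin(theta) + 0.05 * math.cos(5.0 * theta)

def sup_norm(values):
    # Discrete stand-in for the L-infinity norm on the sampled grid.
    return max(abs(v) for v in values)

# Uniform grid on the assumed parameter space [0, 2*pi].
grid = [2.0 * math.pi * i / 1000 for i in range(1001)]

# Residual F = Q_HF - Q_LF evaluated along the grid.
residual = [q_hf(t) - q_lf(t) for t in grid]

hf_norm = sup_norm(q_hf(t) for t in grid)
res_norm = sup_norm(residual)

print(f"sup|Q_HF| ~ {hf_norm:.3f}, sup|F| ~ {res_norm:.3f}")
```

In this toy setting the residual is roughly twenty times smaller in sup-norm than the high-fidelity quantity itself, which is the regime in which the summary above reports the largest savings.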

Abstract

In this work, we consider the general problem of constructing a neural network surrogate model using multi-fidelity information. Motivated by error-complexity estimates for ReLU neural networks, we formulate the correlation between an inexpensive low-fidelity model and an expensive high-fidelity model as a possibly non-linear residual function. This function defines a mapping between 1) the shared input space of the models along with the low-fidelity model output, and 2) the discrepancy between the outputs of the two models. The computational framework proceeds by training two neural networks to work in concert. The first network learns the residual function on a small set of high- and low-fidelity data. Once trained, this network is used to generate additional synthetic high-fidelity data, which is used in the training of the second network. The trained second network then acts as our surrogate for the high-fidelity quantity of interest. We present four numerical examples to demonstrate the power of the proposed framework, showing that significant savings in computational cost may be achieved when the output predictions are desired to be accurate within small tolerances.
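
The two-stage data flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy models `q_lf` and `q_hf` are hypothetical, and a linear least-squares fit (`fit`) stands in for neural network training so that the example runs with the standard library alone. Only the flow of data between the two stages mirrors the description above.

```python
import math

def q_lf(theta):
    # Hypothetical cheap model.
    return math.sin(theta)

def q_hf(theta):
    # Hypothetical expensive model: cheap model plus a small drift.
    return math.sin(theta) + 0.05 * theta

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit(rows, targets):
    """Stand-in for network training: least squares via the normal equations."""
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve(ata, atb)

def predict(weights, row):
    return sum(w * f for w, f in zip(weights, row))

def res_features(theta, qlf):
    # Inputs of the residual map: the parameter and the low-fidelity output.
    return [1.0, theta, qlf]

def surrogate_features(theta):
    # Inputs of the final surrogate: the parameter only.
    return [1.0, theta, math.sin(theta)]

N, N_I = 200, 8
thetas = [2.0 * math.pi * i / (N - 1) for i in range(N)]
lf = [q_lf(t) for t in thetas]                 # N cheap evaluations
hf_idx = list(range(0, N, N // N_I))           # where the expensive model is run
hf = {i: q_hf(thetas[i]) for i in hf_idx}      # only N_I expensive evaluations

# Stage 1: fit the residual F(theta, Q_LF(theta)) = Q_HF(theta) - Q_LF(theta).
w_res = fit([res_features(thetas[i], lf[i]) for i in hf_idx],
            [hf[i] - lf[i] for i in hf_idx])

# Generate synthetic high-fidelity data where only low-fidelity data exist.
hf_all = [hf[i] if i in hf
          else lf[i] + predict(w_res, res_features(thetas[i], lf[i]))
          for i in range(N)]

# Stage 2: fit the surrogate for Q_HF on all N (true plus synthetic) data.
w_sur = fit([surrogate_features(t) for t in thetas], hf_all)

def surrogate(theta):
    return predict(w_sur, surrogate_features(theta))

max_err = max(abs(surrogate(t) - q_hf(t)) for t in thetas)
print(f"max surrogate error on the grid: {max_err:.2e}")
```

The toy residual here is exactly representable by the chosen features, so the fit is essentially exact; with real models the first stage would carry an approximation error that propagates into the synthetic data used in the second stage.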

Paper Structure (17 sections, 1 theorem, 42 equations, 11 figures, 7 tables)

This paper contains 17 sections, 1 theorem, 42 equations, 11 figures, 7 tables.

Key Result

Theorem 1

Let $Q: \Theta \subseteq {\mathbb R}^d \rightarrow {\mathbb R}$ be a bounded function defined on the compact set $\Theta$, and assume $Q$ admits an extension $Q_{e}:\mathbb{R}^d\rightarrow\mathbb{R}$ such that $Q_{e}\in S$ and $||Q_{e}||_{L^{\infty}(\mathbb{R}^d)}\leq c||Q||_{L^{\infty}(\Theta)}$ for [...] such that [...], where $C$ is a positive constant that may depend linearly on $d$ and logarithmically on [...]

Figures (11)

  • Figure 1: Motivation behind the new residual multi-fidelity formulation. Left: deviation of the low-fidelity solution $Q_{LF}$ from the high-fidelity solution $Q_{HF}$, versus a frequency parameter $\theta \in \Theta=[10,50]$. Middle: high-fidelity solution is not a linear function of the low-fidelity solution on the whole parameter space. Right: residual function $F$ has a small magnitude.
  • Figure 2: Schematic representation of the RMFNN algorithm, given a set of $N$ low-fidelity and $N_I \ll N$ high-fidelity data. Top: an initial network ($ResNN$) is trained by $N_I$ low-fidelity and high-fidelity data to learn the residual function $F$. The trained $ResNN$ is used along with the rest of $N_{II}=N-N_I$ low-fidelity data to generate a new set of $N_{II}$ high-fidelity data. Bottom: a deep network ($DNN$) is trained by all $N$ high-fidelity data as a surrogate for the high-fidelity target quantity $Q_{HF}$.
  • Figure 3: An alternative RMFNN approach. A deep network ($DNN$) is trained using $N$ low-fidelity data to learn $Q_{LF}$. A second network ($ResNN$) is trained by $N_I \ll N$ low-fidelity and high-fidelity data as a surrogate for the residual $F$. Finally, the target quantity $Q_{HF}$ is computed by adding $Q_{LF}$ to $F$.
  • Figure 4: Two different residual functions: on the left $G(\theta)$, and on the right $F(\theta, Q_{LF}(\theta))$
  • Figure 5: Mean squared error $\varepsilon_{MSE}$ over 20 training runs approximating $Q_{HF}(\theta) - Q_{LF}(\theta)$ by RMFNN ResNet (red), and DiscrepNN ResNet (blue); average MSE over each ensemble is indicated by the larger marker with bold outline
  • ...and 6 more figures
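
The relations depicted in Figures 2 and 3 can be written compactly as follows. The hat notation for trained networks and the sample index $i$ are introduced here for illustration only. In the last line, feeding the learned low-fidelity surrogate into the residual network is an assumption, since the caption of Figure 3 does not specify this detail.

```latex
\begin{align*}
  % Residual relation (Figure 3: Q_HF is obtained by adding Q_LF to F)
  Q_{HF}(\theta) &= Q_{LF}(\theta) + F\bigl(\theta, Q_{LF}(\theta)\bigr), \\
  % Data budget (Figure 2)
  N &= N_I + N_{II}, \qquad N_I \ll N, \\
  % Synthetic high-fidelity data generated by the trained ResNN (Figure 2)
  \widehat{Q}_{HF}\bigl(\theta^{(i)}\bigr) &= Q_{LF}\bigl(\theta^{(i)}\bigr)
    + \widehat{F}\bigl(\theta^{(i)}, Q_{LF}(\theta^{(i)})\bigr),
    \qquad i = N_I + 1, \dots, N, \\
  % Alternative approach (Figure 3), with \widehat{Q}_{LF} the DNN trained on N low-fidelity data
  Q_{HF}(\theta) &\approx \widehat{Q}_{LF}(\theta)
    + \widehat{F}\bigl(\theta, \widehat{Q}_{LF}(\theta)\bigr).
\end{align*}
```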

Theorems & Definitions (2)

  • Theorem 1
  • proof