Residual Multi-Fidelity Neural Network Computing

Owen Davis; Mohammad Motamed; Raul Tempone

TL;DR

This paper introduces the Residual Multi-Fidelity Neural Network (RMFNN), a theoretically grounded framework for constructing fast high-fidelity surrogates by learning a nonlinear residual $F$ between a low-fidelity model $Q_{LF}$ and a high-fidelity model $Q_{HF}$. The approach trains two networks in tandem: ResNN learns the residual $F(\boldsymbol{\theta}, Q_{LF}(\boldsymbol{\theta}))$, which is then used to generate synthetic high-fidelity data, and DNN learns the high-fidelity quantity $Q_{HF}$ from both the original and the synthetic data. A key theoretical result from Davis and Motamed (2024) guarantees the existence of a ReLU network that approximates a broad class of targets with network complexity tied to the uniform norm of the target, which justifies low-complexity learning when the residual is small. Numerical experiments show substantial computational savings, particularly when $\|F\|_{L^{\infty}}$ is small, and illustrate the framework's advantages over direct high-fidelity learning and discrepancy-based alternatives. The results also point toward multi-fidelity extensions and uncertainty quantification tasks.
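
As a reading aid, the relations implied by this summary can be written out explicitly. The residual definition follows the description of $F$ as the discrepancy between the two model outputs. The symbols $F_{ResNN}$ (the trained residual network) and $\hat{Q}_{HF}$ (a synthetic high-fidelity value) are notation assumed here for illustration and may differ from the paper's own.

$$
F(\boldsymbol{\theta}, Q_{LF}(\boldsymbol{\theta})) := Q_{HF}(\boldsymbol{\theta}) - Q_{LF}(\boldsymbol{\theta}),
\qquad
\hat{Q}_{HF}(\boldsymbol{\theta}) = Q_{LF}(\boldsymbol{\theta}) + F_{ResNN}(\boldsymbol{\theta}, Q_{LF}(\boldsymbol{\theta})).
$$

The first relation is what ResNN is trained to approximate on the small paired data set. The second is how a low-fidelity evaluation is converted into a synthetic high-fidelity sample for training DNN.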

Abstract

In this work, we consider the general problem of constructing a neural network surrogate model using multi-fidelity information. Motivated by error-complexity estimates for ReLU neural networks, we formulate the correlation between an inexpensive low-fidelity model and an expensive high-fidelity model as a possibly non-linear residual function. This function defines a mapping between 1) the shared input space of the models along with the low-fidelity model output, and 2) the discrepancy between the outputs of the two models. The computational framework proceeds by training two neural networks to work in concert. The first network learns the residual function on a small set of high- and low-fidelity data. Once trained, this network is used to generate additional synthetic high-fidelity data, which is used in the training of the second network. The trained second network then acts as our surrogate for the high-fidelity quantity of interest. We present four numerical examples to demonstrate the power of the proposed framework, showing that significant savings in computational cost may be achieved when the output predictions are desired to be accurate within small tolerances.
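
The two-stage data flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the models `q_lf` and `q_hf` are hypothetical toy functions, the sample sizes are arbitrary, and each network is replaced by a stand-in `TinyReLUNet` (one hidden ReLU layer with fixed random weights, with only the output layer fitted by least squares) so that the example stays short and deterministic. The paper trains full networks instead.

```python
import numpy as np

rng = np.random.default_rng(0)

def q_lf(theta):
    # Hypothetical cheap low-fidelity model (assumption for illustration).
    return np.sin(2 * np.pi * theta)

def q_hf(theta):
    # Hypothetical expensive high-fidelity model with a small, smooth residual.
    return q_lf(theta) + 0.1 * theta**2

class TinyReLUNet:
    """Stand-in for a trained network: random hidden ReLU layer,
    output weights fitted by linear least squares."""

    def __init__(self, n_in, width, rng):
        self.W = rng.normal(size=(n_in, width))
        self.b = rng.normal(size=width)
        self.c = np.zeros(width + 1)

    def _features(self, X):
        H = np.maximum(X @ self.W + self.b, 0.0)        # ReLU hidden layer
        return np.hstack([H, np.ones((X.shape[0], 1))])  # append output bias

    def fit(self, X, y):
        self.c, *_ = np.linalg.lstsq(self._features(X), y, rcond=None)
        return self

    def predict(self, X):
        return self._features(X) @ self.c

# N cheap low-fidelity samples, but only N_I << N expensive high-fidelity ones.
N, N_I = 400, 20
theta = rng.uniform(0.0, 1.0, size=N)
y_lf = q_lf(theta)
y_hf_I = q_hf(theta[:N_I])

# Stage 1: the residual network maps (theta, Q_LF) to Q_HF - Q_LF.
X_res_I = np.column_stack([theta[:N_I], y_lf[:N_I]])
res_nn = TinyReLUNet(2, 10, rng).fit(X_res_I, y_hf_I - y_lf[:N_I])

# Generate synthetic high-fidelity data at the remaining N - N_I inputs.
X_res_II = np.column_stack([theta[N_I:], y_lf[N_I:]])
y_hf_II = y_lf[N_I:] + res_nn.predict(X_res_II)

# Stage 2: the surrogate maps theta to Q_HF, trained on all N samples.
y_hf_all = np.concatenate([y_hf_I, y_hf_II])
dnn = TinyReLUNet(1, 100, rng).fit(theta[:, None], y_hf_all)

# Check the surrogate against the true high-fidelity model on a test grid.
theta_test = np.linspace(0.0, 1.0, 200)
surrogate_mse = np.mean((dnn.predict(theta_test[:, None]) - q_hf(theta_test)) ** 2)
print(f"surrogate MSE: {surrogate_mse:.2e}")
```

Note that the residual model takes both the input and the low-fidelity output as features and is deliberately small, while the surrogate sees only the input and is larger. This mirrors the idea that a small residual can be learned with a low-complexity network from few expensive samples.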

Paper Structure (17 sections, 1 theorem, 42 equations, 11 figures, 7 tables)

Introduction
Problem formulation and background
Problem statement
Multi-fidelity modeling
Neural network approximation
Residual multi-fidelity neural network algorithm
Nonlinear residual multi-fidelity modeling
Composite network training
Error and complexity in RMFNN
Sources of error
Computational complexity
Numerical examples
A comparison of residual formulations
The impact of a small residual uniform norm
A parametric ODE problem
...and 2 more sections

Key Result

Theorem 1

Let $Q: \Theta \subseteq \mathbb{R}^d \rightarrow \mathbb{R}$ be a bounded function defined on the compact set $\Theta$, and assume $Q$ admits an extension $Q_{e}:\mathbb{R}^d\rightarrow\mathbb{R}$ such that $Q_{e}\in S$ and $\|Q_{e}\|_{L^{\infty}(\mathbb{R}^d)}\leq c\|Q\|_{L^{\infty}(\Theta)}$ for [...] such that [...], where $C$ is a positive constant that may depend linearly on $d$ and logarithmically on [...]

Figures (11)

Figure 1: Motivation behind the new residual multi-fidelity formulation. Left: deviation of the low-fidelity solution $Q_{LF}$ from the high-fidelity solution $Q_{HF}$, versus a frequency parameter $\theta \in \Theta=[10,50]$. Middle: high-fidelity solution is not a linear function of the low-fidelity solution on the whole parameter space. Right: residual function $F$ has a small magnitude.
Figure 2: Schematic representation of the RMFNN algorithm, given a set of $N$ low-fidelity and $N_I \ll N$ high-fidelity data. Top: an initial network ($ResNN$) is trained by $N_I$ low-fidelity and high-fidelity data to learn the residual function $F$. The trained $ResNN$ is used along with the rest of $N_{II}=N-N_I$ low-fidelity data to generate a new set of $N_{II}$ high-fidelity data. Bottom: a deep network ($DNN$) is trained by all $N$ high-fidelity data as a surrogate for the high-fidelity target quantity $Q_{HF}$.
Figure 3: An alternative RMFNN approach. A deep network ($DNN$) is trained using $N$ low-fidelity data to learn $Q_{LF}$. A second network ($ResNN$) is trained by $N_I \ll N$ low-fidelity and high-fidelity data as a surrogate for the residual $F$. Finally, the target quantity $Q_{HF}$ is computed by adding $Q_{LF}$ to $F$.
Figure 4: Two different residual functions: on the left $G(\theta)$, and on the right $F(\theta, Q_{LF}(\theta))$.
Figure 5: Mean squared error $\varepsilon_{MSE}$ over 20 training runs approximating $Q_{HF}(\theta) - Q_{LF}(\theta)$ by RMFNN ResNet (red) and DiscrepNN ResNet (blue). The average MSE over each ensemble is indicated by the larger marker with a bold outline.
...and 6 more figures

Theorems & Definitions (2)

Theorem 1
proof
