Quadrature Sampling of Parametric Models with Bi-fidelity Boosting
Nuojin Cheng, Osman Asif Malik, Yiming Xu, Stephen Becker, Alireza Doostan, Akil Narayan
TL;DR
This work tackles the costly task of building emulators for parametric maps in forward uncertainty quantification (UQ) by combining quadrature-based least squares (LS) regression with a novel bi-fidelity boosting (BFB) framework. BFB uses a cheap low-fidelity model to select an effective sketch from a set of candidates and then applies that sketch to the expensive high-fidelity data, achieving a residual close to that of the ideal high-fidelity boosted solution with substantially fewer high-fidelity evaluations. The paper provides pre-asymptotic and asymptotic analyses, including optimality bounds and Gaussian-sketch correlation results, and validates the approach on synthetic and PDE datasets, showing reduced data requirements and improved regression accuracy. The results offer practical guidance on when boosting helps, which is governed by the correlation between low- and high-fidelity data, and establish connections to leverage-score and volume-based sketching techniques for efficient surrogate construction in PDE-based UQ workflows.
Abstract
Least squares regression is a ubiquitous tool for building emulators (a.k.a. surrogate models) of problems across science and engineering for purposes such as design space exploration and uncertainty quantification. When the regression data are generated using an experimental design process (e.g., a quadrature grid) involving computationally expensive models, or when the data size is large, sketching techniques have shown promise to reduce the cost of constructing the regression model while ensuring accuracy comparable to that of the full data. However, random sketching strategies, such as those based on leverage scores, lead to regression errors that are random and may exhibit large variability. To mitigate this issue, we present a novel boosting approach that leverages cheaper, lower-fidelity data of the problem at hand to identify the best sketch among a set of candidate sketches. This in turn specifies the sketch of the intended high-fidelity model and the associated data. We provide theoretical analyses of this bi-fidelity boosting (BFB) approach and discuss the conditions the low- and high-fidelity data must satisfy for boosting to succeed. In doing so, we derive a bound on the residual norm of the BFB sketched solution relating it to its ideal, but computationally expensive, high-fidelity boosted counterpart. Empirical results on both manufactured and PDE data corroborate the theoretical analyses and illustrate the efficacy of the BFB solution in reducing the regression error, as compared to the non-boosted solution.
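The selection step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes leverage-score row sampling as the random sketch, assumes the best candidate is the one whose sketched low-fidelity solution has the smallest residual on the full low-fidelity data, and uses hypothetical names throughout (`bfb_least_squares`, `high_fidelity`, `n_candidates`).

```python
import numpy as np


def leverage_scores(A):
    # Leverage scores: squared row norms of an orthonormal basis for range(A).
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)


def sample_sketch(probs, m, rng):
    # Draw m row indices i.i.d. from probs, with the usual rescaling weights.
    idx = rng.choice(len(probs), size=m, replace=True, p=probs)
    weights = 1.0 / np.sqrt(m * probs[idx])
    return idx, weights


def sketched_lstsq(A, b, idx, weights):
    # Solve the least squares problem restricted to the sampled, reweighted rows.
    x, *_ = np.linalg.lstsq(A[idx] * weights[:, None], b[idx] * weights, rcond=None)
    return x


def bfb_least_squares(A, b_low, high_fidelity, m, n_candidates, rng):
    """Pick a sketch using low-fidelity data, then reuse it for high-fidelity data.

    A             : (N, d) design matrix, e.g. basis functions on a quadrature grid
    b_low         : (N,) low-fidelity data, assumed cheap enough to have at every row
    high_fidelity : hypothetical callable mapping an array of row indices to
                    high-fidelity data values at those rows
    """
    scores = leverage_scores(A)
    probs = scores / scores.sum()

    best = None
    for _ in range(n_candidates):
        idx, w = sample_sketch(probs, m, rng)
        x_low = sketched_lstsq(A, b_low, idx, w)
        # Score the candidate by its residual on the FULL low-fidelity problem.
        res = np.linalg.norm(A @ x_low - b_low)
        if best is None or res < best[0]:
            best = (res, idx, w)

    _, idx, w = best
    # The expensive model is evaluated only at the distinct selected rows.
    rows = np.unique(idx)
    b_high = np.zeros(A.shape[0])
    b_high[rows] = high_fidelity(rows)
    return sketched_lstsq(A, b_high, idx, w), idx
```

As a usage example under the same assumptions, `A` could be built from `np.polynomial.legendre.legvander` evaluated at Gauss-Legendre nodes from `np.polynomial.legendre.leggauss`, with each row scaled by the square root of its quadrature weight. The high-fidelity model is then called on at most `m` rows, regardless of how many candidate sketches are compared.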
