Sparse Probabilistic Richardson Extrapolation

Chris. J. Oates, Richard Howey, Toni Karvonen

Abstract

Almost every numerical task can be cast as extrapolation with respect to the fidelity or tolerance parameters of a consistent numerical method. This perspective enables probabilistic uncertainty quantification and optimal experimental design functionality to be deployed, and also unlocks the potential for the convergence of numerical methods to be accelerated. Recent work established Probabilistic Richardson Extrapolation as a proof-of-concept, demonstrating how parallel multi-fidelity simulation can be used to accelerate simulation from a whole-heart model. However, the number of simulations was required to increase super-exponentially in $d$, the number of tolerance parameters appearing in the numerical method. This paper develops a refined notion of 'extrapolation dimension', drastically reducing this simulation requirement when multiple tolerance parameters feature in the numerical method. Sparsity-exploiting methodology is developed that is simultaneously simpler and more powerful than earlier work, and this is accompanied by sharp theoretical guarantees and substantial empirical support.

Paper Structure

This paper contains 61 sections, 11 theorems, 74 equations, and 24 figures.

Key Result

Theorem 1

Let $\mathcal{X} = [\mathbf{0},\mathbf{1}] \subset \mathbb{R}^d$ and let $X_n = \{\mathbf{x}_i\}_{i=1}^n \subset \mathcal{X}$ be $\mathcal{P}_A$-unisolvent, where $n = \mathrm{dim}(A)$. Let $\mathcal{X}_h = [\mathbf{0}, h \mathbf{1}]$ and $X_n^h = \{ h \mathbf{x}_i\}_{i=1}^n$ for $h \in (0,1]$. Let $f_n^h \in \mathcal{P}_A$ denote the interpolant of the data $\{f(\mathbf{x}) : \mathbf{x} \in X_n^h\}$. Then $f_n^h(\mathbf{0}) \to f(\mathbf{0})$ at an accelerated rate as $h \to 0$, meaning that $f_n^h(\mathbf{0})$ converges faster than the original simulator output $f(\mathbf{x})$, $\mathbf{x} \in X_n^h$.
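
To make the mechanism concrete, the following is a minimal sketch of the extrapolation in Theorem 1, specialised to $d = 1$ with index set $A = \{0, 1, 2\}$: a polynomial interpolant is fitted to simulator outputs at the scaled nodes and evaluated at $\mathbf{0}$. The toy simulator `f`, its error expansion, and the node set are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hedged sketch of Theorem 1 with d = 1 and A = {0, 1, 2}, so that
# n = dim(A) = 3. The simulator is a toy assumption: its output tends
# to f(0) = 1 with a leading O(h) error term.
def f(h):
    return 1.0 + 0.7 * h + 0.3 * h**2 + 0.1 * h**3

x = np.array([1.0, 0.5, 0.25])  # P_A-unisolvent nodes (distinct)

for h in [1.0, 0.1, 0.01]:
    nodes = h * x                # scaled design X_n^h
    vals = f(nodes)
    # Interpolant f_n^h in P_A; its constant coefficient is f_n^h(0).
    coeffs = np.polynomial.polynomial.polyfit(nodes, vals, deg=2)
    extrapolated_error = abs(coeffs[0] - 1.0)      # O(h^3)
    raw_error = abs(f(nodes.min()) - 1.0)          # O(h)
    print(f"h={h:<5} extrapolated: {extrapolated_error:.2e}  "
          f"raw: {raw_error:.2e}")
```

Running the sketch shows the extrapolated estimate converging at $O(h^3)$ while the best raw simulator output converges only at $O(h)$, which is the sense in which convergence is accelerated.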

Figures (24)

  • Figure 1: Illustration of extrapolation methods to predict $f(\mathbf{0})$ from data $\{f(\mathbf{x}_i)\}$ on the multivariate ($d=2$) extrapolation task described in \ref{subsec: cubature}. Here the extrapolation dimension was $d_{\mathrm{ext}}(f) = 3$. Left: Classical MRE requires a dataset of size $n_{\min} = d_{\mathrm{ext}}(f)$, to which a polynomial is fitted. Middle: GRE requires a dataset of size $n_{\min} \geq (2^d d!)^d = 64$ to guarantee convergence acceleration, and achieves this by fitting a numerical analysis-informed GP. Right: SPRE applied to a dataset of size $n_{\min} = 1 + d_{\mathrm{ext}}(f)$, where the additional data point enables kernel parameters to be estimated using leave-one-out cross-validation (see the sketch after this list), as described in \ref{sec: UQ}.
  • Figure 2: Convergence in the $d$-dimensional cubature setting of \ref{subsec: cubature}. Here $s \in \{0,1\}$ controls the smoothness of the extrapolation task; for $s = 0$ there is no smoothness to exploit, while for $s = 1$ there is. The absolute error is displayed for MRE, GRE and SPRE. As a baseline, we also consider the estimator $f(\mathbf{x})$ where $\mathbf{x}$ is the element closest to $\mathbf{0}$ in the dataset $X_n^h$. The relative error \ref{eq: rel error} is displayed for GRE and SPRE, where the shaded region represents the density of a standard normal. The full experimental protocol is described in \ref{app: cubature illustration}.
  • Figure 3: Illustrating iterative learning of $(A,k)$ and experimental design for $\{\mathbf{x}_i\}$. An initial design (hollow circles) is used to estimate the index set $A$ and the kernel $k$; here the white noise kernel $k(\mathbf{x},\mathbf{x}') = \sigma^2 \delta_{\mathbf{x},\mathbf{x}'}$ was used and the parameter $\sigma$ is reported. Experimental design is then used to select new design points (filled circles) which are added to the existing design set (hollow circles). The cost function was $c(\mathbf{x}) = \mathbf{x}^{-\mathbf{1}}$ and the true index set $A$ of maximal size, which for this example is $\{(0,0),(1,0),(0,1),(2,0)\}$, is correctly learned.
  • Figure 4: Two Spheres 3D Model. Three frames are displayed from a 3D model simulation with two spheres at times $t = 0$ (left), $t = 0.3$ (middle) and $t = 5$ (right). The final distance from the origin of the red sphere (closest to viewer) is treated as our quantity of interest. (A video is available on the GitHub repository.)
  • Figure 5: Two Spheres 3D Model. Absolute errors of estimates for $f(\mathbf{0})$ as a function of the scaling factor $h$ controlling the design set $X_n^h$. The $n=2^d$ raw estimates $\{f(\mathbf{x}) : \mathbf{x} \in X_n^h \}$ are also displayed.
  • ...and 19 more figures
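
The leave-one-out cross-validation step referenced in the Figure 1 caption, paired with the white noise kernel $k(\mathbf{x},\mathbf{x}') = \sigma^2 \delta_{\mathbf{x},\mathbf{x}'}$ from Figure 3, can be sketched as follows. This is a hedged illustration under our own simplifying assumptions (a flat prior on the polynomial coefficients, so the fit reduces to ordinary least squares, and a PRESS-style estimator for $\sigma$); the index set, synthetic data, and helper functions are hypothetical, not the paper's implementation.

```python
import numpy as np

def design_matrix(X, A):
    """Evaluate the monomials x^a, a in A, at each design point."""
    return np.column_stack([np.prod(X ** np.array(a), axis=1) for a in A])

def loo_sigma(X, y, A):
    """PRESS-style LOO estimate of sigma for the white noise kernel.

    With k = sigma^2 * I and a flat coefficient prior, the polynomial
    fit is ordinary least squares, and the LOO residuals have the
    closed form r_i / (1 - H_ii) via the hat matrix H.
    """
    P = design_matrix(X, A)
    H = P @ np.linalg.solve(P.T @ P, P.T)   # hat matrix of the OLS fit
    r = y - H @ y                           # in-sample residuals
    loo = r / (1.0 - np.diag(H))            # leave-one-out residuals
    return np.sqrt(np.mean(loo ** 2))       # RMS of LOO errors

# Hypothetical d = 2 example using the index set shown in Figure 3,
# with n = dim(A) + 1 points as in the SPRE setting of Figure 1; the
# extra point is what makes cross-validation possible at all.
A = [(0, 0), (1, 0), (0, 1), (2, 0)]
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 1.0, size=(len(A) + 1, 2))
y = (1.0 + 0.5 * X[:, 0] + 0.2 * X[:, 1] + 0.1 * X[:, 0] ** 2
     + 0.05 * rng.standard_normal(len(X)))
print("estimated sigma:", loo_sigma(X, y, A))
```

With only one spare data point the estimate of $\sigma$ is necessarily crude, which matches the role described in the Figure 1 caption: the single additional point makes leave-one-out estimation possible rather than precise.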

Theorems & Definitions (28)

  • Definition 1: Extrapolation dimension
  • Example 1: Extrapolation sparsity for a simple cubature method
  • Theorem 1: Convergence acceleration for MRE
  • Theorem 2: Convergence acceleration for GRE
  • Remark 1: Extrapolation sparsity and SPRE
  • Remark 2: $\mathcal{P}_A$ exactness of SPRE
  • Remark 3: MRE as a special case of SPRE
  • Remark 4: Comparing GRE and SPRE
  • Theorem 3: Convergence acceleration for SPRE
  • Theorem 4: Further convergence acceleration for SPRE
  • ...and 18 more