Dimension reduction for derivative-informed operator learning: An analysis of approximation errors

Dingcheng Luo, Thomas O'Leary-Roseberry, Peng Chen, Omar Ghattas

TL;DR

This paper analyzes the approximation errors of neural operators in Sobolev norms over infinite-dimensional Gaussian input measures. It derives bounds on the errors arising from both the dimension reduction and the latent neural network approximation, including the sampling errors from empirically estimating the PCA/DIS bases.

Abstract

We study the derivative-informed learning of nonlinear operators between infinite-dimensional separable Hilbert spaces by neural networks. Such operators can arise from the solution of partial differential equations (PDEs), and are used in many simulation-based outer-loop tasks in science and engineering, such as PDE-constrained optimization, Bayesian inverse problems, and optimal experimental design. In these settings, the neural network approximations can be used as surrogate models to accelerate the solution of the outer-loop tasks. However, since outer-loop tasks in infinite dimensions often require knowledge of the underlying geometry, the approximation accuracy of the operator's derivatives can also significantly impact the performance of the surrogate model. Motivated by this, we analyze the approximation errors of neural operators in Sobolev norms over infinite-dimensional Gaussian input measures. We focus on the reduced basis neural operator (RBNO), which uses linear encoders and decoders defined on dominant input/output subspaces spanned by reduced sets of orthonormal bases. To this end, we study two methods for generating the bases: principal component analysis (PCA) and derivative-informed subspaces (DIS), which use the dominant eigenvectors of the covariance of the data or the derivatives as the reduced bases, respectively. We then derive bounds for errors arising from both the dimension reduction and the latent neural network approximation, including the sampling errors associated with the empirical estimation of the PCA/DIS. Our analysis is validated on numerical experiments with elliptic PDEs, where our results show that bases informed by the map (i.e., DIS or output PCA) yield accurate reconstructions and low generalization errors for both the operator and its derivatives, while input PCA may underperform unless ranks and training sample sizes are sufficiently large.
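The abstract's description of the PCA and DIS bases and of the RBNO's encode/decode structure can be illustrated with a minimal finite-dimensional sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy map `g(x) = tanh(A x)`, the standard Gaussian input with identity covariance (so no covariance weighting appears in the DIS matrices), the helper names `dominant_eigvecs` and `rbno`, the affine shift `y_mean`, and the untrained placeholder latent map.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n, r = 20, 15, 500, 5

# Hypothetical toy map g(x) = tanh(A x); stands in for a discretized operator.
A = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)

def g(x):
    return np.tanh(A @ x)

def jacobian(x):
    # d/dx tanh(A x) = diag(1 - tanh(A x)^2) A, shape (d_out, d_in)
    return (1.0 - np.tanh(A @ x) ** 2)[:, None] * A

X = rng.standard_normal((n, d_in))   # inputs ~ N(0, I) (assumption)
Y = np.stack([g(x) for x in X])      # outputs, shape (n, d_out)
J = np.stack([jacobian(x) for x in X])  # Jacobians, shape (n, d_out, d_in)

def dominant_eigvecs(M, r):
    """Return the r leading eigenvectors of a symmetric matrix as columns."""
    _, V = np.linalg.eigh(M)         # eigh sorts eigenvalues in ascending order
    return V[:, ::-1][:, :r]

# PCA: dominant eigenvectors of the empirical covariance of the data.
V_in_pca = dominant_eigvecs(np.cov(X, rowvar=False), r)
V_out_pca = dominant_eigvecs(np.cov(Y, rowvar=False), r)

# DIS: dominant eigenvectors of empirical second moments of the derivatives.
H_in = np.einsum("nki,nkj->ij", J, J) / n    # mean of J^T J, (d_in, d_in)
H_out = np.einsum("nik,njk->ij", J, J) / n   # mean of J J^T, (d_out, d_out)
V_in_dis = dominant_eigvecs(H_in, r)
V_out_dis = dominant_eigvecs(H_out, r)

def rbno(x, V_in, V_out, latent_map, y_mean):
    """Reduced basis surrogate: linear encode, latent map, linear decode."""
    z = V_in.T @ x                   # encode onto the input subspace
    return y_mean + V_out @ latent_map(z)  # decode from the output subspace

# Placeholder latent map; a trained neural network would take its place.
W = rng.standard_normal((r, r))
y_hat = rbno(X[0], V_in_dis, V_out_dis, lambda z: np.tanh(W @ z), Y.mean(axis=0))
```

Swapping `V_in_dis`/`V_out_dis` for `V_in_pca`/`V_out_pca` in the call to `rbno` changes only the subspaces, which is the comparison the abstract describes. In the infinite-dimensional setting the paper treats, the corresponding operators would involve the input covariance and Hilbert-space inner products, which this sketch omits.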

Paper Structure

This paper contains 60 sections, 39 theorems, 289 equations, 10 figures, 2 tables.

Key Result

Theorem 2

Suppose $g \in C_{\nu}^{m, p}(\mathbb{R}^{d_\mathrm{in}}, \mathbb{R}^{d_\mathrm{out}})$ for some $m \geq 0$ and $\nu$ is a finite measure on $\mathbb{R}^{d_\mathrm{in}}$. Let $p \in [1, \infty)$, $\psi \in \mathcal{A}^{\infty}_{b}$, and $d_L \geq 2$. Then, for any $\epsilon > 0$, there exists a $d_L

Figures (10)

  • Figure 1: First 200 eigenvalues corresponding to the PCA and DIS bases for the three PDE problems considered. These are computed using a large sample size of $N = 20{,}000$, which we take as a proxy for the exact PCA/DIS.
  • Figure 2: Samples of input-output pairs along with the 1st, 2nd, 3rd, 5th, and 9th PCA/DIS basis vectors for the 1D PDE examples. Top: semilinear elliptic PDE. Bottom: steady Burgers equation. Note the localization of the input DIS basis vectors to the region $s \leq 0.5$ in the semilinear elliptic PDE example where the diffusion coefficient is smallest, and to the region near $s = 0.4$ that is the center of the Gaussian profile scaling the source in the Burgers example.
  • Figure 3: Top: a sample of an input-output pair for the linear elasticity problem. Bottom: the 1st, 2nd, 3rd, 5th, and 9th PCA/DIS basis vectors for the input and output of the linear elasticity problem.
  • Figure 4: Reconstruction errors (normalized by the squared second moments of the test data) in the output and its derivatives for the three PDE problems. Left column: reconstruction error of the output by output PCA/DIS. Center column: reconstruction error of the derivative by the output PCA/DIS. Right column: reconstruction error of the derivative by the input PCA/DIS. Bounds for the reconstruction error in terms of the trailing eigenvalues are shown as dashed lines. True bounds are shown for output reconstruction by the output PCA/DIS and for derivative reconstruction by the input/output DIS. For derivative reconstruction by the input/output PCA bases, we do not compute the constant explicitly; instead, the plot shows the lines $K_1 \sum_{i=r+1}^{\infty} \lambda_i^{\mathrm{PCA}}$ and $K_2 \sum_{i=r+1}^{\infty} \mu_i^{\mathrm{PCA}}$ for some prescribed $K_1, K_2$ to illustrate the decay rate of the reconstruction error bound.
  • Figure 5: Excess risks in the output and derivative reconstruction errors for the semilinear elliptic PDE problem for ranks $r = 10$ and $r = 30$. Values are normalized by the squared second moments of the test data. For reference, the global $N^{-1/2}$ rate is shown as a dotted line and the local $N^{-1}$ rate as a dashed line. The filled regions correspond to the 10% to 90% quantile ranges over the 10 independent runs.
  • ...and 5 more figures
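
The Figure 4 caption relates reconstruction errors to sums of trailing eigenvalues. A minimal finite-dimensional sketch of why such sums appear is given below, under assumptions that are not from the paper: a toy map `g(x) = tanh(A x)`, standard Gaussian inputs, centered output PCA, and in-sample (training-set) errors. In this simplified setting the relation is an exact identity; the paper's bounds concern the infinite-dimensional setting and test data, which this sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_out, n, r = 12, 10, 400, 4

# Hypothetical toy map g(x) = tanh(A x) with inputs x ~ N(0, I).
A = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
X = rng.standard_normal((n, d_in))
Y = np.tanh(X @ A.T)                         # outputs, shape (n, d_out)
J = (1.0 - Y ** 2)[:, :, None] * A[None]     # Jacobians, shape (n, d_out, d_in)

def eig_desc(M):
    """Eigen-decomposition of a symmetric matrix, eigenvalues descending."""
    w, V = np.linalg.eigh(M)
    return w[::-1], V[:, ::-1]

# Output PCA: project centered outputs onto the top-r eigenvectors.
Yc = Y - Y.mean(axis=0)
C_out = Yc.T @ Yc / n                        # empirical covariance (1/n scaling)
eig_out, U = eig_desc(C_out)
P_out = U[:, :r] @ U[:, :r].T                # orthogonal projector, rank r
err_out = np.mean(np.sum((Yc - Yc @ P_out) ** 2, axis=1))
tail_out = eig_out[r:].sum()

# Input DIS: project Jacobians (from the right) onto the top-r eigenvectors
# of the empirical mean of J^T J.
H_in = np.einsum("nki,nkj->ij", J, J) / n
eig_in, V = eig_desc(H_in)
P_in = V[:, :r] @ V[:, :r].T
err_jac = np.mean(np.sum((J - J @ P_in) ** 2, axis=(1, 2)))  # squared Frobenius
tail_in = eig_in[r:].sum()

print("output error vs. trailing eigenvalue sum:", err_out, tail_out)
print("Jacobian error vs. trailing eigenvalue sum:", err_jac, tail_in)
```

Each printed pair agrees up to floating-point error because the mean squared projection error equals the trace of the matrix restricted to the discarded eigenvectors. On held-out data, or for derivative reconstruction with PCA bases, the relation is no longer an identity, which is consistent with the caption showing only decay-rate lines with prescribed constants $K_1, K_2$ in the PCA case.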

Theorems & Definitions (85)

  • Definition 1
  • Theorem 2: Deep universal approximation with smooth, ReLU like activation functions
  • Remark 3: Comment on notation
  • Remark 4
  • Proposition 5
  • Proof
  • Theorem 6: Generic universal approximation
  • Theorem 10: Detailed universal approximation
  • Remark 11
  • Remark 12
  • ...and 75 more