Performance of Neural and Polynomial Operator Surrogates

Josephine Westermann, Benno Huber, Thomas O'Leary-Roseberry, Jakob Zech

Abstract

We consider the problem of constructing surrogate operators for parameter-to-solution maps arising from parametric partial differential equations, where repeated forward model evaluations are computationally expensive. We present a systematic empirical comparison of neural operator surrogates, including a reduced-basis neural operator trained with $L^2_\mu$ and $H^1_\mu$ objectives and the Fourier neural operator, against polynomial surrogate methods, specifically a reduced-basis sparse-grid surrogate and a reduced-basis tensor-train surrogate. All methods are evaluated on a linear parametric diffusion problem and a nonlinear parametric hyperelasticity problem, using input fields with algebraically decaying spectral coefficients at varying rates of decay $s$. To enable fair comparisons, we analyze ensembles of surrogate models generated by varying hyperparameters and compare the resulting Pareto frontiers of cost versus approximation accuracy, decomposing cost into contributions from data generation, setup, and evaluation. Our results show that no single method is universally superior. Polynomial surrogates achieve substantially better data efficiency for smooth input fields ($s \geq 2$), with convergence rates for the sparse-grid surrogate in agreement with theoretical predictions. For rough inputs ($s \leq 1$), the Fourier neural operator displays the fastest convergence rates. Derivative-informed training consistently improves data efficiency over standard $L^2_\mu$ training, providing a competitive alternative for rough inputs in the low-data regime when Jacobian information is available at reasonable cost. These findings highlight the importance of matching the surrogate methodology to the regularity of the problem as well as the accuracy demands and computational constraints of the application.
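The Pareto-frontier comparison can be made concrete with a short sketch. The following Python snippet is a minimal illustration, not the authors' code: pareto_front is a hypothetical helper, and the numeric ensemble is made up. Given surrogate models described by total cost (data generation plus setup plus evaluation) and test error, it keeps exactly those models that no other model beats on both criteria.

    import numpy as np

    def pareto_front(costs, errors):
        """Return indices of Pareto-optimal surrogates: models for which
        no other model achieves both lower cost and lower error."""
        costs = np.asarray(costs, dtype=float)
        errors = np.asarray(errors, dtype=float)
        front, best_error = [], np.inf
        for i in np.argsort(costs):        # sweep from cheapest to most expensive
            if errors[i] < best_error:     # strictly improves on all cheaper models
                front.append(i)
                best_error = errors[i]
        return front

    # Hypothetical ensemble: total cost (data + setup + evaluation) vs. test error.
    costs  = [10.0, 25.0, 40.0, 60.0, 90.0]
    errors = [1e-1, 3e-2, 5e-2, 8e-3, 7e-3]
    print(pareto_front(costs, errors))     # -> [0, 1, 3, 4]

Model 2 drops off the frontier because model 1 is both cheaper and more accurate; repeating this filtering per method and per decay rate $s$ yields cost-accuracy curves of the kind compared in the paper.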

Paper Structure

This paper contains 67 sections, 4 theorems, 83 equations, 21 figures, 1 table.

Key Result

Theorem 5.2

Let Assumption asm:ops:holomorphy be satisfied with $s>1$, $t>0$. Fix $\delta>0$ (arbitrarily small). Then there exists a constant $C>0$ such that for every $N \in {\mathbb N}$, there exists a ReLU NN $\tilde{g}$ of size $N$ such that [approximation error bound omitted].

Figures (21)

  • Figure 1: Encoder-decoder architecture.
  • Figure 2: Input samples constructed as in \ref{eq:samples} using a fixed coefficient vector ${\bm c}$ and varying $s$ values. As $s$ decreases, the correlation length becomes smaller (a hedged sketch of this construction follows the figure list).
  • Figure 3: A random parameter field $x({\bm \xi})$ (left) and corresponding FE solution $y({\bm \xi})$ of the linear elliptic PDE \ref{eq:elliptic} (right).
  • Figure 4: 2D deformation problem
  • Figure 5: A random parameter field $x({\bm \xi})$ (left) and corresponding FE solution $y({\bm \xi})$ of the hyperelasticity problem (right).
  • ...and 16 more figures
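The construction referenced in the caption of Figure 2 can be sketched as follows. A plausible form for a random input field with algebraically decaying spectral coefficients is (the basis functions $\psi_j$ and the distribution of the $\xi_j$ are illustrative assumptions, not taken from \ref{eq:samples} itself):

$$x({\bm \xi}) \;=\; \sum_{j \ge 1} c_j \, j^{-s} \, \xi_j \, \psi_j, \qquad \xi_j \overset{\text{i.i.d.}}{\sim} \mathcal{U}[-1,1],$$

so the coefficient of the $j$-th mode decays algebraically at rate $s$: larger $s$ suppresses high-frequency modes and yields smoother samples with longer correlation length, consistent with the behavior described in Figure 2.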

Theorems & Definitions (8)

  • Remark 1
  • Example 1
  • Remark 2
  • Theorem 5.2: herrmann2024neural
  • Theorem 5.3: kovachki2021universal
  • Remark 3
  • Theorem 5.4: herrmann2024neural
  • Theorem 5.5: ttexpressionrate2026