An operator learning perspective on parameter-to-observable maps

Daniel Zhengyu Huang, Nicholas H. Nelsen, Margaret Trautner

TL;DR

This work introduces Fourier Neural Mappings (FNMs), extending Fourier Neural Operators to handle finite-dimensional vector inputs/outputs in parameter-to-observable maps. It establishes universal-approximation guarantees for FNMs and develops a theoretical Bayesian framework to compare end-to-end versus full-field learning for linear functionals, revealing that full-field learning can be more data-efficient for smooth QoIs while end-to-end can excel for rough QoIs. Theoretical results are complemented by numerical experiments on nonlinear PtO maps (advection-diffusion, aerodynamic drag/lift, and homogenization) that corroborate the theory and demonstrate the practical value of learning in function spaces. Overall, the paper provides rigorous guidance on when to learn the forward operator versus the PtO map directly, with implications for data collection and surrogate modeling in scientific computing.

Abstract

Computationally efficient surrogates for parametrized physical models play a crucial role in science and engineering. Operator learning provides data-driven surrogates that map between function spaces. However, instead of full-field measurements, often the available data are only finite-dimensional parametrizations of model inputs or finite observables of model outputs. Building on Fourier Neural Operators, this paper introduces the Fourier Neural Mappings (FNMs) framework that is able to accommodate such finite-dimensional vector inputs or outputs. The paper develops universal approximation theorems for the method. Moreover, in many applications the underlying parameter-to-observable (PtO) map is defined implicitly through an infinite-dimensional operator, such as the solution operator of a partial differential equation. A natural question is whether it is more data-efficient to learn the PtO map end-to-end or first learn the solution operator and subsequently compute the observable from the full-field solution. A theoretical analysis of Bayesian nonparametric regression of linear functionals, which is of independent interest, suggests that the end-to-end approach can actually have worse sample complexity. Extending beyond the theory, numerical results for the FNM approximation of three nonlinear PtO maps demonstrate the benefits of the operator learning perspective that this paper adopts.
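To make the function-to-vector idea concrete, the following is a minimal NumPy sketch, not the paper's architecture. It assumes a 1D periodic domain sampled on a uniform grid, a single Fourier layer, and a plain grid average as the step that turns a function into a vector; the paper's Definition 1.1 may specify this step differently. All names (`init_params`, `spectral_conv_1d`, `function_to_vector`) and the layer sizes are hypothetical choices for illustration.

```python
import numpy as np

def init_params(rng, d_u, width, modes, d_y, scale=0.1):
    """Random parameters for the sketch (hypothetical layout, not from the paper)."""
    shape = (modes, width, width)
    return {
        "lift": scale * rng.standard_normal((d_u, width)),
        "fourier": scale * (rng.standard_normal(shape)
                            + 1j * rng.standard_normal(shape)),
        "local": scale * rng.standard_normal((width, width)),
        "readout": scale * rng.standard_normal((width, d_y)),
    }

def spectral_conv_1d(v, weights):
    """Multiply the lowest Fourier modes of v by learned complex weights.

    v: real array of shape (n_grid, channels_in)
    weights: complex array of shape (modes, channels_in, channels_out)
    """
    n_grid = v.shape[0]
    modes = weights.shape[0]
    v_hat = np.fft.rfft(v, axis=0)                    # (n_grid // 2 + 1, c_in)
    out_hat = np.zeros((v_hat.shape[0], weights.shape[2]), dtype=complex)
    out_hat[:modes] = np.einsum("kc,kcd->kd", v_hat[:modes], weights)
    return np.fft.irfft(out_hat, n=n_grid, axis=0)    # (n_grid, c_out)

def function_to_vector(u, params):
    """Map grid samples u of shape (n_grid, d_u) to a vector of length d_y."""
    v = u @ params["lift"]                            # pointwise lifting
    v = np.tanh(spectral_conv_1d(v, params["fourier"]) + v @ params["local"])
    pooled = v.mean(axis=0)                           # grid average, approximates an integral over [0, 1)
    return pooled @ params["readout"]                 # finite-dimensional output

# Usage: the same parameters can be applied at any grid resolution.
rng = np.random.default_rng(0)
params = init_params(rng, d_u=1, width=4, modes=8, d_y=3)
x = np.arange(64) / 64
u = (np.sin(2 * np.pi * x) + 0.5 * np.cos(4 * np.pi * x))[:, None]
y = function_to_vector(u, params)                     # array of shape (3,)
```

Because the weights act on Fourier modes rather than on grid points, and the final reduction is a grid average, the same parameters accept inputs sampled at different resolutions. For a smooth periodic input the output should change very little when the grid is refined.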


Paper Structure

This paper contains 37 sections, 35 theorems, 205 equations, 9 figures, and 1 table.

Key Result

Theorem 2.2

Let $s\geq 0$, $\mathcal{D}\subset \mathbb{R}^d$ be an open Lipschitz domain such that $\overline{\mathcal{D}}\subset (0,1)^d$, and $\mathcal{U}=H^s(\mathcal{D};\mathbb{R}^{d_u})$. Let $\Psi^\dagger\colon \mathcal{U}\to\mathbb{R}^{d_y}$ be a continuous mapping. Let $K\subset \mathcal{U}$ be compact

Figures (9)

  • Figure 1: End-to-end (EE) vs. full-field (FF) convergence rate exponents (\ref{eqn:linear_compare_rateexp}) as a function of the QoI regularity exponent $r$. Larger exponents imply faster convergence rates. As the curves get lighter, $\alpha+\beta$, an indicator of the smoothness of the problem, increases. The vertical dashed line marks $r=-1/2$, the transition point at which EE and FF have the same rate and at which power law decay of the QoI coefficients begins.
  • Figure 2: Empirical sample complexity of the Bayesian EE and FF estimators for linear PtO maps based on a Poisson problem. The solid purple lines are best linear fits to the broken curves with markers, which correspond to numerically computed squared errors. In all three figures, the experimentally observed convergence rates closely match those from the theoretical upper bounds in Corollary \ref{cor:linear_compare} (see Table \ref{tab:rates_linear}).
  • Figure 3: Visualization of the velocity-to-state map for the advection--diffusion model. Rows denote the dimension of the KL expansion of the velocity profile and columns display representative input and output fields.
  • Figure 4: Empirical sample complexity of FNM and NN architectures for the advection--diffusion PtO map (note that Figure \ref{subfig:data_ad_d2} has a different vertical axis range). The shaded regions denote two standard deviations away from the mean of the test error over $5$ realizations of the random training dataset indices, batch indices during SGD, and model parameter initializations.
  • Figure 5: Flow over an airfoil. From left to right: visualization of the cubic design element and different airfoil configurations, guided by the displacement field of the control nodes; a close-up view of the $C$-grid surrounding the airfoil; the physical domain discretized by the $C$-grid.
  • ...and 4 more figures

Theorems & Definitions (68)

  • Definition 1.1: Fourier Neural Mappings
  • Theorem 2.2: universal approximation: function-to-vector mappings
  • Theorem 2.3: universal approximation: vector-to-function mappings
  • Remark 3.1: equivalence to regularized empirical risk minimization
  • Remark 3.4: independence of the KL coefficients
  • Remark 3.7: examples of linear QoIs
  • Theorem 3.8: end-to-end learning: optimized convergence rate
  • Theorem 3.9: full-field learning: convergence rate for power law QoI
  • Remark 3.10: posterior contraction rates
  • Corollary 3.10: sample complexity comparison
  • ...and 58 more