Statistical Learning Theory for Neural Operators

Niklas Reinhardt, Sven Wang, Jakob Zech

TL;DR

This work develops statistical convergence theory for learning operators between infinite-dimensional Hilbert spaces from finite noisy data, extending classical nonparametric regression to neural operator learning. It formulates empirical risk minimization over operator classes, establishes high-probability bounds via metric entropy and chaining techniques, and applies the theory to FrameNet, an encoder-decoder neural-operator architecture built on frames. For holomorphic operators, FrameNet achieves algebraic, dimension-free convergence rates in the sample size, and the results are instantiated for a prototypical parametric elliptic PDE (Darcy flow). By combining M-estimation, approximation theory, and neural-operator design, the paper gives convergence guarantees for learning PDE solution operators, such as coefficient-to-solution maps, from noisy data.

Abstract

We present statistical convergence results for the learning of (possibly) non-linear mappings in infinite-dimensional spaces. Specifically, given a map $G_0:\mathcal X\to\mathcal Y$ between two separable Hilbert spaces, we analyze the problem of recovering $G_0$ from $n\in\mathbb N$ noisy input-output pairs $(x_i, y_i)_{i=1}^n$ with $y_i = G_0 (x_i)+\varepsilon_i$; here the $x_i\in\mathcal X$ represent randomly drawn 'design' points, and the $\varepsilon_i$ are assumed to be either i.i.d. white noise processes or subgaussian random variables in $\mathcal{Y}$. We provide general convergence results for least-squares-type empirical risk minimizers over compact regression classes $\mathbf G\subseteq L^\infty(\mathcal X,\mathcal Y)$, in terms of their approximation properties and metric entropy bounds, which are derived using empirical process techniques. This generalizes classical results from finite-dimensional nonparametric regression to an infinite-dimensional setting. As a concrete application, we study an encoder-decoder based neural operator architecture termed FrameNet. Assuming $G_0$ to be holomorphic, we prove algebraic (in the sample size $n$) convergence rates in this setting, thereby overcoming the curse of dimensionality. To illustrate the wide applicability, as a prototypical example we discuss the learning of the non-linear solution operator to a parametric elliptic partial differential equation.
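
The regression setup in the abstract can be illustrated with a small numerical sketch. This is a toy analogue under stated assumptions, not FrameNet and not the paper's estimator: inputs and outputs are replaced by finite coefficient vectors, the operator class is the hypothetical linear-in-parameters family $G_W(x) = W\phi(x)$ with a made-up feature map `features`, the design is uniform on $[-1,1]^8$, and the noise is Gaussian. All function names and constants are illustrative. The sketch fits the least-squares empirical risk minimizer for two sample sizes and estimates the squared $L^2$ error against the true operator by Monte Carlo.

```python
import numpy as np

def features(x):
    """Hypothetical encoder: truncated coefficients of x plus their squares."""
    return np.concatenate([x, x**2], axis=-1)

def apply_operator(W, X):
    """Evaluate G_W(x) = W @ features(x) for each row x of X."""
    return features(X) @ W.T

def empirical_risk(W, X, Y):
    """(1/n) * sum_i ||y_i - G_W(x_i)||^2 in the truncated output space."""
    return float(np.mean(np.sum((Y - apply_operator(W, X)) ** 2, axis=1)))

def fit_erm(X, Y):
    """Least-squares empirical risk minimizer over the class {G_W}."""
    W_T, *_ = np.linalg.lstsq(features(X), Y, rcond=None)
    return W_T.T

rng = np.random.default_rng(0)
d_in, d_out, sigma = 8, 6, 0.5  # illustrative dimensions and noise level

# Assumed ground truth G_0 = G_{W0}, with decaying weights on later features.
W0 = rng.standard_normal((d_out, 2 * d_in)) / np.arange(1, 2 * d_in + 1)

# Fresh design points used to estimate the squared L^2 error by Monte Carlo.
X_test = rng.uniform(-1.0, 1.0, size=(5000, d_in))

l2_errors = {}
risk_gap = {}
for n in (50, 2000):
    # Random design x_i and noisy observations y_i = G_0(x_i) + eps_i.
    X = rng.uniform(-1.0, 1.0, size=(n, d_in))
    Y = apply_operator(W0, X) + sigma * rng.standard_normal((n, d_out))

    W_hat = fit_erm(X, Y)

    diff = apply_operator(W_hat, X_test) - apply_operator(W0, X_test)
    l2_errors[n] = float(np.mean(np.sum(diff**2, axis=1)))

    # By construction the ERM never has larger empirical risk than G_0.
    risk_gap[n] = empirical_risk(W_hat, X, Y) - empirical_risk(W0, X, Y)
    print(f"n={n}: estimated squared L2 error = {l2_errors[n]:.4f}")
```

Because the toy class contains the true operator, only the statistical error is visible here; in the paper's setting the class also incurs an approximation error, and the convergence results balance the two through metric entropy bounds.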

Paper Structure

This paper contains 47 sections, 39 theorems, 292 equations.

Key Result

Theorem 1.1

Consider the operator $G_0$ from the Darcy problem on the $d$-dimensional torus ${\mathbb{T}}^d$ ($d\ge 2$), and suppose that $\gamma$ satisfies supp-gamma for some $\mathfrak{s} > 3d/2+1$ and $a_{min}>0$. Fix $\tau>0$ (arbitrarily small). Then there exists a constant $C$ such that for each $n\in\ma

Theorems & Definitions (83)

  • Theorem 1.1: Informal
  • Remark 2.1
  • Example 2.2
  • Remark 2.3: Connection to maximum likelihood
  • Remark 2.4
  • Theorem 2.5
  • Theorem 2.6: $L^2(\gamma)$-Concentration under Random Design
  • Corollary 2.7: $L^2(\gamma)$-Mean Squared Error
  • Corollary 2.8
  • Remark 2.9: Effective smoothness
  • ...and 73 more