The Parametric Complexity of Operator Learning

Samuel Lanthaler, Andrew M. Stuart

TL;DR

A novel neural operator architecture is introduced, termed HJ-Net, which explicitly takes into account characteristic information of the underlying Hamiltonian system, and can provably beat the curse of parametric complexity related to the infinite-dimensional input and output function spaces.

Abstract

Neural operator architectures employ neural networks to approximate operators mapping between Banach spaces of functions; they may be used to accelerate model evaluations via emulation, or to discover models from data. Consequently, the methodology has received increasing attention over recent years, giving rise to the rapidly growing field of operator learning. The first contribution of this paper is to prove that for general classes of operators which are characterized only by their $C^r$- or Lipschitz-regularity, operator learning suffers from a "curse of parametric complexity", which is an infinite-dimensional analogue of the well-known curse of dimensionality encountered in high-dimensional approximation problems. The result is applicable to a wide variety of existing neural operators, including PCA-Net, DeepONet and the FNO. The second contribution of the paper is to prove that this general curse can be overcome for solution operators defined by the Hamilton-Jacobi equation; this is achieved by leveraging additional structure in the underlying solution operator, going beyond regularity. To this end, a novel neural operator architecture is introduced, termed HJ-Net, which explicitly takes into account characteristic information of the underlying Hamiltonian system. Error and complexity estimates are derived for HJ-Net which show that this architecture can provably beat the curse of parametric complexity related to the infinite-dimensional input and output function spaces.

Paper Structure

This paper contains 51 sections, 30 theorems, 280 equations, 1 figure, 2 algorithms.

Key Result

proposition 2.0

Let $r\in \mathbb{N}$ be given. For any dimension $D\in \mathbb{N}$, there exist $f_{D,r} \in C^r([0,1]^D;\mathbb{R})$ and constants $\overline{\epsilon},\gamma > 0$ such that any ReLU neural network $\Psi: \mathbb{R}^D \to \mathbb{R}$ approximating $f_{D,r}$ to accuracy $\epsilon$, with $\epsilon \le \overline{\epsilon}$, has size at least $\mathrm{size}(\Psi) \ge \epsilon^{-\gamma D/r}$. The constant $\overline{\epsilon}$ ...
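
To get a feel for how quickly the lower bound $\epsilon^{-\gamma D/r}$ grows with the dimension $D$, the following minimal sketch simply evaluates that expression. The proposition only asserts that some $\gamma > 0$ exists, so the value `gamma = 1.0` used here is a placeholder assumption, and the function name `relu_size_lower_bound` is hypothetical. The bound is only claimed for $\epsilon \le \overline{\epsilon}$, which this sketch does not check.

```python
def relu_size_lower_bound(eps: float, D: int, r: int, gamma: float = 1.0) -> float:
    """Evaluate eps**(-gamma * D / r), the size lower bound from the proposition.

    gamma = 1.0 is an illustrative placeholder; the proposition only
    guarantees that some gamma > 0 exists.
    """
    if not (0.0 < eps < 1.0):
        raise ValueError("eps must lie in (0, 1)")
    if D < 1 or r < 1:
        raise ValueError("D and r must be positive integers")
    return eps ** (-gamma * D / r)


if __name__ == "__main__":
    # Fixed accuracy and smoothness; only the input dimension D varies.
    for D in (1, 10, 100):
        bound = relu_size_lower_bound(eps=0.1, D=D, r=2)
        print(f"D = {D:3d}: size(Psi) >= {bound:.3e}")
```

For fixed $\epsilon$ and $r$, the bound is exponential in $D$, which is the finite-dimensional curse of dimensionality that the paper's "curse of parametric complexity" generalizes to the infinite-dimensional setting.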

Figures (1)

  • Figure 1: Diagrammatic illustration of operator learning based on an encoding $\mathcal{E}$, a neural network $\Psi$, and a reconstruction $\mathcal{R}$.
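
The composition $\mathcal{R} \circ \Psi \circ \mathcal{E}$ described in the Figure 1 caption can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the construction used in the paper: the encoder is assumed to be point evaluation at fixed sensor locations, the network is an untrained, randomly initialised ReLU network, and the reconstruction is assumed to be an expansion in sine functions. All function names are hypothetical.

```python
import math
import random


def encode(u, sensors):
    """E: map a function u to its values at fixed sensor points (assumed encoder)."""
    return [u(x) for x in sensors]


def make_relu_network(sizes, seed=0):
    """Build an untrained, randomly initialised ReLU network Psi: R^n -> R^m."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        W = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((W, b))

    def psi(z):
        for i, (W, b) in enumerate(layers):
            z = [sum(w * zj for w, zj in zip(row, z)) + bi for row, bi in zip(W, b)]
            if i < len(layers) - 1:  # ReLU on hidden layers only
                z = [max(0.0, v) for v in z]
        return z

    return psi


def reconstruct(coeffs):
    """R: map coefficients c_k to the function sum_k c_k sin(k pi x) (assumed basis)."""
    def v(x):
        return sum(c * math.sin((k + 1) * math.pi * x) for k, c in enumerate(coeffs))
    return v


def operator_surrogate(u, sensors, psi):
    """Apply the composition R o Psi o E to an input function u."""
    return reconstruct(psi(encode(u, sensors)))


if __name__ == "__main__":
    sensors = [i / 7 for i in range(8)]   # 8 sensor points on [0, 1]
    psi = make_relu_network([8, 16, 4])   # 8 inputs, 16 hidden units, 4 outputs
    v = operator_surrogate(lambda x: math.sin(2 * math.pi * x), sensors, psi)
    print(v(0.3))                         # output function evaluated at x = 0.3
```

In this picture, the complexity question studied in the paper concerns how large $\Psi$ must be as the target accuracy improves; the encoder and reconstruction dimensions (8 and 4 here) are arbitrary choices for illustration.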

Theorems & Definitions (79)

  • proposition 2.0: Neural Network CoD
  • example 2.1
  • definition 2.2
  • remark 2.3
  • remark 2.4
  • remark 2.5
  • lemma 2.6
  • definition 2.7: Functional of neural network-type
  • remark 2.8
  • definition 2.9: Operator of neural network-type
  • ...and 69 more