Discretization Error of Fourier Neural Operators

Samuel Lanthaler; Andrew M. Stuart; Margaret Trautner

TL;DR

This work quantifies the discretization error arising from grid-based implementations of Fourier Neural Operators, providing an $O(N^{-s})$ convergence rate that depends on input Sobolev regularity and persists through network layers. It decomposes total error into discretization and model discrepancy, showing that the discretization component can be tightly bounded and analyzed alongside the continuum FNO. The authors validate the theory with extensive numerical experiments, compare smooth versus non-smooth activations, and demonstrate that periodic positional encodings and smooth activations preserve regularity, improving convergence. An adaptive subsampling strategy is proposed to accelerate training by exploiting the discretization-model error decomposition. Overall, the paper offers both theoretical and practical guidance for efficiently training FNOs on discretized grids in PDE-related applications.
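The decomposition mentioned above can be written as a triangle inequality. The notation here is assumed for illustration and is not taken from the paper: $\mathcal{G}^\dagger$ denotes the target operator, $\Psi$ the continuum FNO, $\Psi^N$ its implementation on a grid with $N$ points per dimension, and $\|\cdot\|$ an unspecified norm on the output space.

$$
\underbrace{\|\mathcal{G}^\dagger(a) - \Psi^N(a)\|}_{\text{total error}}
\;\le\;
\underbrace{\|\mathcal{G}^\dagger(a) - \Psi(a)\|}_{\text{model discrepancy}}
\;+\;
\underbrace{\|\Psi(a) - \Psi^N(a)\|}_{\text{discretization error}}
$$

Only the last term depends on the grid, and it is the term for which the $O(N^{-s})$ rate is stated. Refining the grid therefore stops paying off once the discretization error falls below the model discrepancy, which is presumably what the adaptive subsampling strategy exploits.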

Abstract

Operator learning is a variant of machine learning that is designed to approximate maps between function spaces from data. The Fourier Neural Operator (FNO) is one of the main model architectures used for operator learning. The FNO combines linear and nonlinear operations in physical space with linear operations in Fourier space, leading to a parameterized map acting between function spaces. Although FNOs are defined as objects in continuous space that perform convolutions on a continuum, they are implemented as discretized objects that compute on a grid, which allows efficient implementation via the FFT. Thus, there is a discretization error between the continuum FNO definition and the discretized object used in practice, and this error is separate from other previously analyzed sources of model error. We examine this discretization error here and obtain algebraic rates of convergence in terms of the grid resolution as a function of the input regularity. Numerical experiments that validate the theory and describe model stability are performed. In addition, an algorithm is presented that leverages the decomposition into discretization error and model error to reduce computational training time.
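To make the notion of discretization error concrete, the following is a minimal sketch, not the authors' code. It builds a toy 1D two-layer "FNO-like" map with fixed, hand-picked weights and measures how its output on an $N$-point grid differs from the output on a much finer reference grid. The functions `sample_input`, `toy_layer`, `toy_fno`, and `discretization_errors`, the spectral multiplier, the `tanh` activation, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def sample_input(n_grid, s=2.0, n_modes=256, seed=0):
    """Point values of one fixed random periodic function on n_grid points.

    Coefficients decay like k^-(s + 1/2), so the function has roughly
    Sobolev regularity s (an assumption made for this illustration).
    """
    rng = np.random.default_rng(seed)  # same seed -> same function on every grid
    k = np.arange(1, n_modes + 1)
    amp = rng.standard_normal(n_modes) * k ** (-(s + 0.5))
    phase = rng.uniform(0.0, 2.0 * np.pi, n_modes)
    x = np.arange(n_grid) / n_grid
    waves = np.cos(2.0 * np.pi * np.outer(k, x) + phase[:, None])
    return (amp[:, None] * waves).sum(axis=0)

def toy_layer(v, k_max=8, w=0.7):
    """Pointwise linear term plus a Fourier multiplier on the lowest k_max modes."""
    n = v.size
    v_hat = np.fft.rfft(v)
    v_hat[k_max:] = 0.0                                # truncate high modes
    v_hat[:k_max] *= 1.0 / (1.0 + np.arange(k_max))    # hypothetical fixed weights
    return np.tanh(w * v + np.fft.irfft(v_hat, n=n))   # smooth activation

def toy_fno(v, depth=2):
    for _ in range(depth):
        v = toy_layer(v)
    return v

def discretization_errors(grid_sizes=(16, 32, 64, 128, 256), n_ref=2048):
    """RMS difference between coarse-grid outputs and a fine-grid reference."""
    ref = toy_fno(sample_input(n_ref))
    errors = []
    for n in grid_sizes:
        coarse = toy_fno(sample_input(n))
        diff = coarse - ref[:: n_ref // n]  # compare on shared grid points
        errors.append(float(np.sqrt(np.mean(diff ** 2))))
    return errors

if __name__ == "__main__":
    for n, err in zip((16, 32, 64, 128, 256), discretization_errors()):
        print(f"N = {n:4d}   RMS discretization error = {err:.3e}")
```

The input is evaluated exactly at the grid points, so the only source of error in this sketch is aliasing inside the FFT-based layers. The fine-grid reference stands in for the continuum model, which cannot be evaluated exactly.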

Paper Structure (22 sections, 14 theorems, 102 equations, 10 figures)

Introduction
Contributions
Related work
Notation
Main results
Numerical experiments
Experiments with random weights
Discretization error for random weights models
Experiments with trained networks
Example 1: PDE solution model
Example 2: gradient map
Speeding up training via adaptive subsampling
Conclusions
Trigonometric interpolation and aliasing
Discretization error derivation
...and 7 more sections

Key Result

theorem 3.2

Let Assumptions asst:main hold. Let $\mathcal{A}_c$ be a compact set in $\mathcal{A}$. Let $v_t(a)\coloneqq \mathsf{L}_{t-1}\circ\dots \circ \mathsf{L}_0\circ \mathcal{P}(a)$, with $\mathcal{P}$ and each $\mathsf{L}$ as defined in Definition 2.1. Similarly, let $v_t^N(a)\coloneqq \mathsf{L}^N_{t-1}\circ\dots$ [...] where the constant $C$ depends on $B, M, d, s, t$, and $\mathcal{A}_c$.
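A bound of the form $C N^{-s}$, as in the $O(N^{-s})$ rate quoted in the TL;DR, appears as a straight line of slope $-s$ on a log-log plot of error against $N$. The sketch below estimates that slope from measured errors by least squares. The helper `empirical_rate` and the synthetic data are hypothetical and serve only to show the fitting step; they are not results from the paper.

```python
import numpy as np

def empirical_rate(grid_sizes, errors):
    """Fit errors ~ C * N^(-rate) by linear least squares in log-log space."""
    log_n = np.log(np.asarray(grid_sizes, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    slope, intercept = np.polyfit(log_n, log_e, 1)  # log e = slope * log N + intercept
    return float(-slope), float(np.exp(intercept))

# Synthetic errors that follow 3 * N^(-1.5) exactly (illustrative data only).
grid_sizes = [32, 64, 128, 256, 512]
synthetic_errors = [3.0 * n ** -1.5 for n in grid_sizes]

rate, constant = empirical_rate(grid_sizes, synthetic_errors)
print(f"estimated rate = {rate:.3f}, estimated constant = {constant:.3f}")
```

With measured rather than synthetic errors, the fit is meaningful only in the range of $N$ where the error has not yet reached floating-point round-off.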

Figures (10)

Figure 1: Relative error versus $N$ and $s$ for an FNO with default weight initialization.
Figure 2: Relative error versus $N$ and $s$ for a default FNO with a ReLU activation.
Figure 3: Relative error versus $N$ and $s$ for a default FNO with non-periodic position encoding appended to the input.
Figure 4: Visualization of the input and output data for the trained model examples.
Figure 5: Error versus discretization for inputs of varying regularity for the FNO trained on data corresponding to a PDE solution.
...and 5 more figures

Theorems & Definitions (29)

definition 2.1: Fourier Neural Operator
theorem 3.2
theorem 3.3
remark 3.4
lemma 3.4
lemma A.1
proof
lemma A.2
proof
lemma A.3
...and 19 more
