Analysis of Fourier Neural Operators via Effective Field Theory

Taeyoung Kim

TL;DR

The results quantify how nonlinearity enables neural operators to capture non-trivial features, supply criteria for hyperparameter selection via criticality analysis, and explain why scale-invariant activations and residual connections enhance feature learning in FNOs.

Abstract

Fourier Neural Operators (FNOs) have emerged as leading surrogates for solver operators for various functional problems, yet their stability, generalization, and frequency behavior lack a principled explanation. We present a systematic effective field theory analysis of FNOs in an infinite-dimensional function space, deriving closed recursion relations for the layer kernel and four-point vertex and then examining three practically important settings: analytic activations, scale-invariant cases, and architectures with residual connections. The theory shows that nonlinear activations inevitably couple frequency inputs to high-frequency modes that are otherwise discarded by spectral truncation, and experiments confirm this frequency transfer. For wide networks, we derive explicit criticality conditions on the weight initialization ensemble that ensure small input perturbations maintain a uniform scale across depth, and we confirm experimentally that the theoretically predicted ratio of kernel perturbations matches the measurements. Taken together, our results quantify how nonlinearity enables neural operators to capture non-trivial features, supply criteria for hyperparameter selection via criticality analysis, and explain why scale-invariant activations and residual connections enhance feature learning in FNOs. Finally, we translate the criticality theory into a practical criterion-matched initialization (calibration) procedure; on a standard PDEBench Burgers benchmark, the calibrated FNO exhibits markedly more stable optimization, faster convergence, and improved test error relative to a vanilla FNO.
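The frequency-transfer claim in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' experiment and does not implement an FNO layer; it only checks that a pointwise nonlinearity applied to a band-limited signal produces energy above a truncation cutoff. The grid size `N`, the cutoff `K_MAX`, the input mode, and the helper names `dft_magnitudes` and `energy_above` are all assumptions chosen for illustration.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |DFT coefficient| / N for the non-negative modes 0..N//2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        mags.append(abs(coeff) / n)
    return mags


def energy_above(mags, k_max):
    """Sum of squared magnitudes over modes with index strictly above k_max."""
    return sum(m * m for m in mags[k_max + 1:])


N = 64      # number of grid points (illustrative assumption)
K_MAX = 4   # hypothetical spectral truncation cutoff

# Band-limited input: a single sine at mode 2, well below the cutoff.
u = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]

# Pointwise nonlinearities, applied in physical space.
relu_u = [max(0.0, x) for x in u]
tanh_u = [math.tanh(x) for x in u]

e_in = energy_above(dft_magnitudes(u), K_MAX)
e_relu = energy_above(dft_magnitudes(relu_u), K_MAX)
e_tanh = energy_above(dft_magnitudes(tanh_u), K_MAX)

print(f"energy above cutoff: input={e_in:.2e}, relu={e_relu:.2e}, tanh={e_tanh:.2e}")
```

In this toy setting the input has essentially no energy above the cutoff, while both activated signals do: ReLU generates even harmonics of the input mode and tanh generates odd ones. A subsequent spectral truncation would discard those modes again, which is the interplay the abstract refers to.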

Paper Structure

This paper contains 48 sections, 102 equations, 12 figures, 1 table, 1 algorithm.

Figures (12)

  • Figure 1: Schematic overview of the paper: (top) the truncated FNO architecture; (middle) nonlinearity-induced frequency transfer and the EFT description via the kernel/vertex hierarchy; (bottom) closed-form results in three regimes, their experimental validation, and the effectiveness of calibration based on our theory.
  • Figure 2: Reduced kernel (log scale), quadratic activation, no residual connections, widths $n\in\{4,64\}$ at depths $l=2$ (top) and $l=4$ (bottom).
  • Figure 3: Reduced kernel (log scale), tanh activation, no residual connections, widths $n\in\{4,64\}$ at depths $l=2$ (top) and $l=4$ (bottom).
  • Figure 4: Reduced kernel (log scale), ReLU activation, no residual connections, widths $n\in\{4,64\}$ at depths $l=2$ (top) and $l=4$ (bottom).
  • Figure 5: Reduced kernel (log scale), tanh activation, with residual connections, width $n=32$ with $\gamma\in\{0.1,0.9\}$ at depths $l=2$ (top) and $l=6$ (bottom).
  • ...and 7 more figures
