Linearization Turns Neural Operators into Function-Valued Gaussian Processes

Emilia Magnani, Marvin Pförtner, Tobias Weber, Philipp Hennig

TL;DR

The paper introduces LUNO, a practical framework for uncertainty quantification in neural operators. LUNO linearizes the trained operator around the mean of a Gaussian weight belief, which turns that belief into a function-valued Gaussian process over the operator's outputs. A theoretical link, established via probabilistic currying, connects Banach-valued Gaussian processes to Gaussian processes on augmented inputs, so uncertainty can be added post hoc and at scale without retraining. The method is demonstrated on Fourier neural operators (FNOs) and specializes to a computable last-layer Laplace variant (LUNO-LA), which is reported to improve calibration and predictive performance in low-data and out-of-distribution scenarios. The resulting uncertainty estimates are resolution-agnostic, with potential applications in safe scientific computing and active learning for PDEs.

Abstract

Neural operators generalize neural networks to learn mappings between function spaces from data. They are commonly used to learn solution operators of parametric partial differential equations (PDEs) or propagators of time-dependent PDEs. However, to make them useful in high-stakes simulation scenarios, their inherent predictive error must be quantified reliably. We introduce LUNO, a novel framework for approximate Bayesian uncertainty quantification in trained neural operators. Our approach leverages model linearization to push (Gaussian) weight-space uncertainty forward to the neural operator's predictions. We show that this can be interpreted as a probabilistic version of the concept of currying from functional programming, yielding a function-valued (Gaussian) random process belief. Our framework provides a practical yet theoretically sound way to apply existing Bayesian deep learning methods such as the linearized Laplace approximation to neural operators. Just as the underlying neural operator, our approach is resolution-agnostic by design. The method adds minimal prediction overhead, can be applied post-hoc without retraining the network, and scales to large models and datasets. We evaluate these aspects in a case study on Fourier neural operators.
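The linearization step described in the abstract can be illustrated with a minimal JAX sketch. It is not the LUNO implementation: the toy network `f`, the weight mean `w_mean`, and the isotropic weight covariance `Sigma` are hypothetical stand-ins chosen for illustration. The sketch only shows the general idea that a first-order expansion around the weight mean turns a Gaussian weight belief into a Gaussian predictive belief with covariance $J \Sigma J^\top$.

```python
import jax
import jax.numpy as jnp

def f(w, x):
    # Hypothetical toy network (not an FNO): one tanh hidden layer,
    # 1-d input, 2-d output, all weights packed into a flat vector w.
    W1 = w[:4].reshape(4, 1)
    W2 = w[4:12].reshape(2, 4)
    return W2 @ jnp.tanh(W1 @ x)

# Assumed Gaussian weight belief N(w_mean, Sigma); values are illustrative.
w_mean = jnp.linspace(-1.0, 1.0, 12)
Sigma = 0.01 * jnp.eye(12)

def predictive_mean(x):
    # The linearized model evaluated at the mean weights equals f itself.
    return f(w_mean, x)

def predictive_cov(x1, x2):
    # Jacobians of the outputs with respect to the weights at w_mean.
    J1 = jax.jacobian(f)(w_mean, x1)  # shape (2, 12)
    J2 = jax.jacobian(f)(w_mean, x2)  # shape (2, 12)
    # Pushforward of the weight covariance through the linearized model.
    return J1 @ Sigma @ J2.T

x = jnp.array([0.3])
mean = predictive_mean(x)
cov = predictive_cov(x, x)
```

Because `predictive_cov` accepts two different inputs, it acts as a matrix-valued covariance function, so the linearized model defines a Gaussian process over inputs rather than only pointwise error bars.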

Paper Structure

This paper contains 52 sections, 10 theorems, 53 equations, 7 figures, and 12 tables.

Key Result

Lemma 2.1

Let $(\Omega, \mathcal{A}, \mathrm{P})$ be a probability space, ${\bm{\mathrm{f}}} \colon {\mathbb{A}} \times \Omega \to \mathbb{R}^{d'}$, $\mathbb{I} = \{1, \dotsc, d'\}$, and ${\mathrm{f}} \colon ({\mathbb{A}} \times \mathbb{I}) \times \Omega \to \mathbb{R}$ with [...] as well as, for all $a_1, a_2 \in {\mathbb{A}}$ and $i, j \in \mathbb{I}$, [...]
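The setup above pairs a vector-valued process on $\mathbb{A}$ with a scalar-valued process on the augmented input space $\mathbb{A} \times \mathbb{I}$. As a rough illustration, the JAX sketch below assumes the usual identification $K_{ij}(a_1, a_2) = k((a_1, i), (a_2, j))$ between a matrix-valued covariance $K$ and a scalar covariance $k$ on augmented inputs. This identification, the separable kernel form, and the names `B`, `K`, and `k` are assumptions made for the example and are not taken from the text.

```python
import jax.numpy as jnp

# Hypothetical positive-definite covariance between the d' = 2 outputs.
B = jnp.array([[1.0, 0.5],
               [0.5, 2.0]])

def K(a1, a2):
    # Matrix-valued covariance of a 2-output process (assumed separable form).
    return jnp.exp(-0.5 * (a1 - a2) ** 2) * B

def k(p, q):
    # Scalar covariance on augmented inputs p = (a1, i) and q = (a2, j).
    (a1, i), (a2, j) = p, q
    return K(a1, a2)[i, j]

inputs = [0.0, 0.7, 1.5]

# Multi-output view: block (p, q) of the Gram matrix is K(a_p, a_q).
gram_multi = jnp.block([[K(a, b) for b in inputs] for a in inputs])

# Augmented-input view: one scalar entry per pair of (input, output index).
aug = [(a, i) for a in inputs for i in range(2)]
gram_aug = jnp.stack([jnp.stack([k(p, q) for q in aug]) for p in aug])
```

With the augmented inputs ordered input-major, both constructions give the same 6 x 6 Gram matrix, which is the sense in which the two views describe the same finite-dimensional Gaussian distributions.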

Figures (7)

  • Figure 1: Illustration of the steps involved in LUNO. A trained neural operator ${\bm{F}}$ (top left) is converted into an equivalent neural network ${\bm{f}}$ with outputs in $\mathbb{R}^{d_{\mathbb{U}}'}$ using (reverse) currying (top right). Linearizing ${\bm{f}}$ around the mean of the Gaussian weight belief results in a Gaussian process posterior ${\bm{\mathrm{f}}}$ quantifying the uncertainty about the function learned by ${\bm{f}}$ (bottom right). Finally, probabilistic currying transforms ${\bm{\mathrm{f}}}$ into a function-valued Gaussian process posterior $\boldsymbol{\mathrm{F}}$ over the operator learned by the neural operator ${\bm{F}}$ (bottom left).
  • Figure 2: FNO predictive uncertainty quantified by several different methods. Top row: target function, mean and 1.96 standard deviations of, as well as samples from, the predictive belief. For the ensemble, the samples are four of the ensemble members. Bottom row: spread of the predictive distribution around the mean. For the sample-/ensemble-based methods, we construct a Gaussian distribution from the empirical covariance matrix and draw four samples. We plot 1.96 standard deviations of the predictive belief, as well as the top-three eigenfunctions and a heatmap of the predictive covariance matrix (top right corner of panels).
  • Figure 3: Comparing an ensemble (left) and LUNO-LA (right). Top row shows target, residuals, and the predictive standard deviation. Bottom row shows the absolute ratio of the pointwise residual and the predictive standard deviation as well as a sample from the predictive belief. Since the uncertainty structure of the ensemble prediction is of low rank, we also include its unexplained error by projecting the residual vector onto the null space of the predictive covariance.
  • Figure 4: Averaged performance of different UQ methods on an autoregressive rollout of the FNO on 50 trajectories from the Pos-Neg-Flip dataset. We compare input perturbations, deep ensembles, Sample-Iso, LUNO-Iso, Sample-LA, and LUNO-LA.
  • Figure 5: Initial condition and three time steps of a single trajectory per generated dataset (Base, Flip, Pos, Pos-Neg, Pos-Neg-Flip).
  • ...and 2 more figures

Theorems & Definitions (35)

  • Example 2.1: Fourier Neural Operators
  • Lemma 2.1
  • Definition 3.1: Banach-Valued Gaussian Process
  • Theorem 3.2: Probabilistic Currying in Banach Spaces; proof in \ref{sec:bvgp}
  • Example 3.1: Currying a Continuous Bivariate Gaussian Process
  • Remark A.1: The Bidual Embedding
  • Definition A.2: Mean and Covariance Operator (Bogachev, 1998)
  • Definition A.3: Cross-Covariance Operator
  • Definition A.4: Gaussian Measure (Bogachev, 1998)
  • Remark A.5: Jointly Gaussian Measures
  • ...and 25 more