Generative Neural Operators through Diffusion Last Layer

Sungwon Park, Anthony Zhou, Hongjoong Kim, Amir Barati Farimani

TL;DR

This work addresses uncertainty quantification for neural operators by introducing the diffusion last layer (DLL), a lightweight head that upgrades any deterministic neural-operator backbone into a conditional generative surrogate. DLL learns an input-conditioned, low-rank Karhunen-Loève (KL) representation of outputs and trains a diffusion model in coefficient space to sample from $p(x|a)$, where $x$ denotes the expansion coefficients and $a$ the input field, preserving discretization invariance and enabling efficient uncertainty modeling. Across stochastic PDE benchmarks, DLL improves both distributional fidelity and calibration, while also enhancing long-horizon rollout stability in deterministic chaotic systems. The approach combines an operator encoder with a compatible diffusion head, offering a practical, scalable path to uncertainty-aware operator learning with broad applicability to inverse problems and irregular geometries.

Abstract

Neural operators have emerged as a powerful paradigm for learning discretization-invariant function-to-function mappings in scientific computing. However, many practical systems are inherently stochastic, making principled uncertainty quantification essential for reliable deployment. To address this, we introduce a simple add-on, the diffusion last layer (DLL), a lightweight probabilistic head that can be attached to arbitrary neural operator backbones to model predictive uncertainty. Motivated by the relative smoothness and low-dimensional structure often exhibited by PDE solution distributions, DLL parameterizes the conditional output distribution directly in function space through a low-rank Karhunen-Loève expansion, enabling efficient and expressive uncertainty modeling. Across stochastic PDE operator learning benchmarks, DLL improves generalization and uncertainty-aware prediction. Moreover, even in deterministic long-horizon rollout settings, DLL enhances rollout stability and provides meaningful estimates of epistemic uncertainty for backbone neural operators.

Paper Structure (53 sections, 9 theorems, 64 equations, 7 figures, 9 tables)

This paper contains 53 sections, 9 theorems, 64 equations, 7 figures, 9 tables.

Key Result

Proposition 2.3

Fix $c$. Under suitable regularity conditions, there exists a constant $C>0$ such that

Figures (7)

  • Figure 1: Deterministic vs. generative neural operators. A standard neural operator (top) maps an input field $a\in\mathcal{A}$ to a single prediction $u\in\mathcal{U}$. Our Diffusion Last Layer (bottom) turns the same backbone into a conditional generator by attaching a lightweight diffusion head, producing a full predictive distribution $p(u\,|\,a)$ over functions rather than a point estimate.
  • Figure 2: Training and inference pipeline for the Diffusion Last Layer (DLL). (1) Operator encoder. An NO backbone produces basis functions $\Phi(a)$ and an NF maps targets to coefficients $\xi=\mathtt{NF}(u)$, yielding the low-rank reconstruction $\hat{u}=\xi^\top\Phi(a)$. (2) DLL training. With the encoder frozen, we train a conditional diffusion model in coefficient space: an MLP denoiser predicts noise (or score) from $(a,x_t,t)$ using NO features of $a$ for conditioning. (3) DLL inference. For a new input $a$, we sample coefficients $x\sim p(x\mid a)$ by iterative denoising and decode $\hat{u}=x^\top\Phi(a)$.
  • Figure 3: Stochastic Burgers' equation. Columns compare the ground truth and different surrogate models. For each method, the top panel shows multiple realizations of the solution field $u(x)$ for a fixed input, illustrating sample diversity. The bottom panel shows the predictive mean (solid line) and an uncertainty band given by standard deviation (shaded region) estimated from the samples.
  • Figure 4: Stochastic Darcy flow. For each method, we generate conditional samples of the solution field and summarize them by the sample mean (top) and per-pixel sample standard deviation (bottom). This visualization highlights both accuracy of the central prediction and the spatial structure of predictive uncertainty.
  • Figure 5: KS equation. Long-horizon rollout comparison at rollout step 50. The black curve denotes the ground-truth solution, while the blue curve shows the predictive mean of each method. Shaded regions indicate predictive standard deviation estimated from generated samples.
  • ...and 2 more figures
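
The inference step in the Figure 2 caption (sample coefficients by iterative denoising, then decode with $\hat{u}=x^\top\Phi(a)$) can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation. It assumes a standard DDPM-style ancestral sampler with a linear noise schedule, which the caption does not specify. The names `toy_denoiser`, `toy_basis`, `cond_mean`, and `a_fn` are hypothetical stand-ins: the trained MLP denoiser is replaced by the closed-form noise predictor for a Gaussian coefficient distribution, and the NO backbone by a hand-written sine basis.

```python
import numpy as np

def make_schedule(T=200, beta_min=1e-4, beta_max=0.05):
    # Linear DDPM noise schedule (an assumption; the source does not state one).
    betas = np.linspace(beta_min, beta_max, T)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)

def toy_denoiser(cond_mean, x_t, alpha_bar_t, s=0.5):
    # Hypothetical stand-in for the trained MLP denoiser: the exact noise
    # predictor E[eps | x_t] when the clean coefficients are N(cond_mean, s^2 I).
    var_t = alpha_bar_t * s**2 + (1.0 - alpha_bar_t)
    return np.sqrt(1.0 - alpha_bar_t) * (x_t - np.sqrt(alpha_bar_t) * cond_mean) / var_t

def sample_coefficients(cond_mean, n_samples, rng, T=200):
    # Ancestral sampling in coefficient space, starting from pure noise.
    betas, alphas, alpha_bars = make_schedule(T)
    x = rng.standard_normal((n_samples, cond_mean.shape[0]))
    for t in range(T - 1, -1, -1):
        eps_hat = toy_denoiser(cond_mean, x, alpha_bars[t])
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:  # no noise is added on the final step
            x = x + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

def toy_basis(a_fn, grid, r):
    # Hypothetical stand-in for Phi(a): r input-dependent basis functions that
    # can be evaluated on any grid. Returns an array of shape (r, len(grid)).
    k = np.arange(1, r + 1)[:, None]
    return a_fn(grid)[None, :] * np.sin(k * np.pi * grid[None, :])

def a_fn(y):
    # Toy input field a(y).
    return 1.0 + 0.5 * y

rng = np.random.default_rng(0)
r = 4
cond_mean = np.array([1.0, -0.5, 0.25, 0.0])  # toy conditioning from features of a

coeffs = sample_coefficients(cond_mean, n_samples=512, rng=rng)  # shape (512, r)

# Decode the same coefficient samples on two different resolutions.
u_coarse = coeffs @ toy_basis(a_fn, np.linspace(0.0, 1.0, 32), r)   # shape (512, 32)
u_fine = coeffs @ toy_basis(a_fn, np.linspace(0.0, 1.0, 128), r)    # shape (512, 128)

pred_mean = u_fine.mean(axis=0)  # predictive mean field
pred_std = u_fine.std(axis=0)    # pointwise uncertainty band
```

Because the diffusion model only ever sees the $r$ coefficients, the same samples can be decoded on any grid where the basis can be evaluated, which is one way to read the discretization-invariance claim in the TL;DR.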

Theorems & Definitions (18)

  • Proposition 2.3
  • Proposition 4.1: Optimal rank-$r$ reconstruction
  • Lemma 1.2: Differential inequality
  • proof
  • Proposition 1.3: Integrated stability bound
  • proof
  • proof: Proof of Proposition \ref{prop:wasserstein_stability}
  • Proposition 1.5: Conditional stability
  • proof
  • Lemma 1.7: Projection decomposition
  • ...and 8 more
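
The statement of Proposition 4.1 (optimal rank-$r$ reconstruction) is not reproduced on this page. As a hedged illustration of the general idea only, the classical finite-sample analogue is the Eckart-Young theorem: truncating the SVD of a centered sample matrix gives the best rank-$r$ approximation in Frobenius norm, with squared error equal to the sum of the discarded squared singular values. The sketch below checks this on a synthetic ensemble. The data, the grid, and the helper names `reconstruct` and `rel_error` are assumptions for illustration, and this empirical basis is not input-conditioned as the learned basis $\Phi(a)$ in the paper is.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 64)

# Synthetic ensemble: random combinations of three smooth modes plus small noise.
modes = np.stack([np.sin((k + 1) * np.pi * grid) for k in range(3)])
samples = rng.standard_normal((200, 3)) @ modes + 0.01 * rng.standard_normal((200, 64))

# Empirical Karhunen-Loeve (PCA) basis from the SVD of the centered samples.
mean = samples.mean(axis=0)
centered = samples - mean
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

def reconstruct(rank):
    # Project onto the leading `rank` empirical modes and map back.
    coeffs = centered @ Vt[:rank].T
    return mean + coeffs @ Vt[:rank]

def rel_error(rank):
    # Frobenius reconstruction error relative to the centered ensemble norm.
    return np.linalg.norm(samples - reconstruct(rank)) / np.linalg.norm(centered)

errors = [rel_error(rank) for rank in range(1, 6)]

# Eckart-Young: the rank-r error equals the norm of the discarded singular values.
tail_norms = [np.sqrt(np.sum(S[rank:] ** 2)) for rank in range(1, 6)]
```

In this toy setting the error drops sharply once the rank reaches the number of underlying modes, which mirrors the low-dimensional structure that the abstract cites as motivation for a low-rank expansion.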