Generative Neural Operators through Diffusion Last Layer

Sungwon Park; Anthony Zhou; Hongjoong Kim; Amir Barati Farimani

TL;DR

This work addresses uncertainty quantification for neural operators by introducing the diffusion last layer (DLL), a lightweight head that upgrades any deterministic neural-operator backbone into a conditional generative surrogate. DLL learns an input-conditioned, low-rank Karhunen-Loève (KL) representation of the outputs and trains a diffusion model in coefficient space to sample coefficients $x$ from $p(x\mid a)$ given the input $a$, preserving discretization invariance and keeping uncertainty modeling efficient. Across stochastic PDE benchmarks, DLL improves both distributional fidelity and calibration, and it also improves long-horizon rollout stability in deterministic chaotic systems. The approach combines an operator encoder with a diffusion head, offering a practical, scalable path to uncertainty-aware operator learning with broad applicability to inverse problems and irregular geometries.

Abstract

Neural operators have emerged as a powerful paradigm for learning discretization-invariant function-to-function mappings in scientific computing. However, many practical systems are inherently stochastic, making principled uncertainty quantification essential for reliable deployment. To address this, we introduce a simple add-on, the diffusion last layer (DLL), a lightweight probabilistic head that can be attached to arbitrary neural operator backbones to model predictive uncertainty. Motivated by the relative smoothness and low-dimensional structure often exhibited by PDE solution distributions, DLL parameterizes the conditional output distribution directly in function space through a low-rank Karhunen-Loève expansion, enabling efficient and expressive uncertainty modeling. Across stochastic PDE operator learning benchmarks, DLL improves generalization and uncertainty-aware prediction. Moreover, even in deterministic long-horizon rollout settings, DLL enhances rollout stability and provides meaningful estimates of epistemic uncertainty for backbone neural operators.
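The low-rank Karhunen-Loève expansion mentioned in the abstract can be illustrated in its classical, fixed-basis form. The sketch below is not the DLL method itself: it assumes a data-driven basis computed by an SVD of centered samples, whereas DLL learns a basis that depends on the input. The function names `truncated_kl` and `reconstruct` and the toy sine-mode data are hypothetical and chosen only for illustration.

```python
import numpy as np

def truncated_kl(samples, rank):
    """Empirical rank-`rank` Karhunen-Loeve expansion of sampled fields.

    samples: array of shape (n_samples, n_grid), one discretized field per row.
    Returns (mean, basis, coeffs) with basis of shape (rank, n_grid) and
    coeffs of shape (n_samples, rank).
    """
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Rows of vt are orthonormal principal directions (discrete KL modes),
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank]
    coeffs = centered @ basis.T
    return mean, basis, coeffs

def reconstruct(mean, basis, coeffs):
    """Map coefficients back to fields: u_hat = mean + coeffs @ basis."""
    return mean + coeffs @ basis

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 128)
# Hypothetical toy data: random combinations of three smooth modes.
modes = np.stack([np.sin(np.pi * k * grid) for k in (1, 2, 3)])
fields = rng.normal(size=(200, 3)) @ modes

mean, basis, coeffs = truncated_kl(fields, rank=3)
recon = reconstruct(mean, basis, coeffs)
rel_err = np.linalg.norm(recon - fields) / np.linalg.norm(fields)
```

Because the toy fields lie in a three-dimensional subspace, three modes reconstruct them up to floating-point error. Smooth PDE solution ensembles are usually only approximately low-rank, so in practice the truncation rank trades accuracy against the dimension of the coefficient space.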

Paper Structure (53 sections, 9 theorems, 64 equations, 7 figures, 9 tables)

Introduction
Background
Operator Learning
Conditional Diffusion Models
Uncertainty Quantification and Probabilistic Surrogates
Uncertainty Quantification
Classical Probabilistic Surrogates
Conditional Diffusion Models as Probabilistic Surrogates
Diffusion Last Layer Neural Operators
Operator Encoder
Diffusion Last Layer
Experiments
Choice of Baselines
Stochastic Operator Learning
Stochastic Burgers' Equation.
...and 38 more sections

Key Result

Proposition 2.3

Fix $c$. Under suitable regularity conditions, there exists a constant $C>0$ such that

Figures (7)

Figure 1: Deterministic vs. generative neural operators. A standard neural operator (top) maps an input field $a\in\mathcal{A}$ to a single prediction $u\in\mathcal{U}$. Our Diffusion Last Layer (bottom) turns the same backbone into a conditional generator by attaching a lightweight diffusion head, producing a full predictive distribution $p(u\,|\,a)$ over functions rather than a point estimate.
Figure 2: Training and inference pipeline for the Diffusion Last Layer (DLL). (1) Operator encoder. A NO backbone produces basis functions $\Phi(a)$ and a NF maps targets to coefficients $\xi=\mathtt{NF}(u)$, yielding the low-rank reconstruction $\hat{u}=\xi^\top\Phi(a)$. (2) DLL training. With the encoder frozen, we train a conditional diffusion model in coefficient space: an MLP denoiser predicts noise (or score) from $(a,x_t,t)$ using NO features of $a$ for conditioning. (3) DLL inference. For a new input $a$, we sample coefficients $x\sim p(x\mid a)$ by iterative denoising and decode $\hat{u}=x^\top\Phi(a)$.
Figure 3: Stochastic Burgers' equation. Columns compare the ground truth and different surrogate models. For each method, the top panel shows multiple realizations of the solution field $u(x)$ for a fixed input, illustrating sample diversity. The bottom panel shows the predictive mean (solid line) and an uncertainty band given by standard deviation (shaded region) estimated from the samples.
Figure 4: Stochastic Darcy flow. For each method, we generate conditional samples of the solution field and summarize them by the sample mean (top) and per-pixel sample standard deviation (bottom). This visualization highlights both accuracy of the central prediction and the spatial structure of predictive uncertainty.
Figure 5: KS equation. Long-horizon rollout comparison at rollout step 50. The black curve denotes the ground-truth solution, while the blue curve shows the predictive mean of each method. Shaded regions indicate predictive standard deviation estimated from generated samples.
...and 2 more figures
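The inference step in Figure 2 (sample coefficients by iterative denoising, then decode with the basis) and the mean and standard-deviation summaries in Figures 3 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses standard DDPM ancestral sampling with a linear noise schedule, replaces the trained MLP denoiser with a hypothetical closed-form `toy_denoiser` that is exact for a Gaussian target, and uses fixed sine modes as a stand-in for the backbone-produced basis $\Phi(a)$. All function names, the schedule values, and the vector `mu` are assumptions.

```python
import numpy as np

def ddpm_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Linear variance schedule; returns (betas, alphas, cumulative alphas)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)

def toy_denoiser(mu, x_t, alpha_bar_t, target_std=0.1):
    """Hypothetical stand-in for the trained denoiser.

    Returns the exact noise prediction E[eps | x_t] when the clean
    coefficients follow N(mu, target_std^2 I); `mu` plays the role of the
    conditioning information derived from the input a.
    """
    var_t = alpha_bar_t * target_std**2 + (1.0 - alpha_bar_t)
    return np.sqrt(1.0 - alpha_bar_t) * (x_t - np.sqrt(alpha_bar_t) * mu) / var_t

def sample_coefficients(mu, n_samples, rng, num_steps=1000):
    """DDPM ancestral sampling of coefficients x ~ p(x | a)."""
    betas, alphas, alpha_bars = ddpm_schedule(num_steps)
    x = rng.normal(size=(n_samples, mu.shape[0]))  # start from pure noise
    for t in range(num_steps - 1, -1, -1):
        eps_hat = toy_denoiser(mu, x, alpha_bars[t])
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:  # no noise is added on the final step
            x = x + np.sqrt(betas[t]) * rng.normal(size=x.shape)
    return x

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 64)
rank = 4
# Hypothetical basis: fixed sine modes. In the pipeline of Figure 2 the basis
# would come from the neural-operator backbone and depend on the input a.
basis = np.stack([np.sin(np.pi * (k + 1) * grid) for k in range(rank)])
mu = np.array([1.0, -0.5, 0.25, 0.0])  # stand-in for conditioning on a

coeffs = sample_coefficients(mu, n_samples=256, rng=rng)
fields = coeffs @ basis           # decode each sample: u_hat = x^T Phi(a)
pred_mean = fields.mean(axis=0)   # predictive mean over samples
pred_std = fields.std(axis=0)     # pointwise standard deviation (uncertainty band)
```

Since the diffusion runs over `rank` coefficients rather than over grid values, the cost of each denoising step is independent of the output resolution; only the final decoding touches the grid.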

Theorems & Definitions (18)

Proposition 2.3
Proposition 4.1: Optimal rank-$r$ reconstruction
Lemma 1.2: Differential inequality
proof
Proposition 1.3: Integrated stability bound
proof
proof: Proof of Proposition \ref{prop:wasserstein_stability}
Proposition 1.5: Conditional stability
proof
Lemma 1.7: Projection decomposition
...and 8 more
