Calibrated Uncertainty Quantification for Operator Learning via Conformal Prediction
Ziqi Ma, Kamyar Azizzadenesheli, Anima Anandkumar
TL;DR
This work tackles calibrated uncertainty quantification for operator learning, where outputs are functions and calibration must hold across the whole domain. It introduces UQNO, a distribution-free, finite-sample framework that learns a base operator $\hat{\mathcal{G}}$ and a per-point uncertainty proxy $E(a)(x)$, then forms $C_\lambda(a)(x)$ as a per-point ball around $\hat{\mathcal{G}}(a)(x)$ and calibrates it with split conformal prediction to achieve an $(\alpha,\delta)$-risk-controlling guarantee. The approach combines a generalized quantile loss for the base operator with conformal calibration, yielding simultaneous pointwise uncertainty estimates across the whole domain and a PAC bound on the coverage rate. Empirical results on 2D Darcy flow and 3D car surface pressure prediction show calibrated, tight uncertainty bands that outperform baselines; in the 3D case, UQNO meets the target calibration percentage of $98\%$ where the other methods fail. The framework offers principled, function-space uncertainty for safety-critical PDE-informed tasks with finite data, and opens avenues for extensions to mixed discretizations and uncertainty decomposition.
Abstract
Operator learning has been increasingly adopted in scientific and engineering applications, many of which require calibrated uncertainty quantification. Since the output of operator learning is a continuous function, quantifying uncertainty simultaneously at all points in the domain is challenging. Current methods either consider calibration at a single point or over one scalar function, or make strong assumptions such as Gaussianity. We propose a risk-controlling quantile neural operator, a distribution-free, finite-sample conformal prediction method for functional calibration. We provide a theoretical calibration guarantee on the coverage rate, defined as the expected percentage of points on the function domain whose true value lies within the predicted uncertainty ball. Empirical results on a 2D Darcy flow task and a 3D car surface pressure prediction task validate our theoretical results, demonstrating calibrated coverage and efficient uncertainty bands that outperform baseline methods. In particular, on the 3D problem, our method is the only one that meets the target calibration percentage (the percentage of test samples for which the uncertainty estimates are calibrated) of 98%.
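To make the calibration step concrete, the following is a minimal sketch of split conformal calibration for function-valued outputs on a fixed grid. It is not the authors' implementation, and it rests on several assumptions that are not stated in the text above: the ball at each point is taken to have radius $\lambda E(a)(x)$; $\alpha$ is taken to be the tolerated fraction of uncovered points within one function and $\delta$ the tolerated fraction of functions that miss that coverage level; every function is sampled on the same $m$ grid points; and any correction for discretization error is ignored. The helper names `function_scores`, `calibrate_lambda`, and `coverage_rate` are hypothetical.

```python
import math
import numpy as np


def function_scores(y_true, y_pred, width, alpha):
    """Per-function conformal score on a shared grid.

    Inputs have shape (n_functions, n_points). The score of a function is
    the smallest lambda for which at least ceil((1 - alpha) * n_points)
    grid points satisfy |y_true - y_pred| <= lambda * width.
    Assumption: `width` plays the role of the uncertainty proxy E(a)(x)
    and is strictly positive.
    """
    ratios = np.sort(np.abs(y_true - y_pred) / width, axis=1)
    m = ratios.shape[1]
    k = math.ceil((1 - alpha) * m)  # number of points that must be covered
    return ratios[:, k - 1]


def calibrate_lambda(scores, delta):
    """Standard split conformal quantile of the calibration scores.

    Returns the ceil((n + 1) * (1 - delta))-th smallest score, or infinity
    when the calibration set is too small for the requested delta.
    """
    n = scores.shape[0]
    k = math.ceil((n + 1) * (1 - delta))
    if k > n:
        return float("inf")
    return float(np.sort(scores)[k - 1])


def coverage_rate(y_true, y_pred, width, lam):
    """Fraction of grid points of each function inside the scaled ball."""
    return np.mean(np.abs(y_true - y_pred) <= lam * width, axis=1)
```

Under these assumptions and exchangeability of calibration and test functions, a new function's score falls below the calibrated $\lambda$ with probability at least $1-\delta$, which means at least a $1-\alpha$ fraction of its grid points lie inside the scaled ball. A typical use would compute `function_scores` on held-out calibration data, pass the result to `calibrate_lambda`, and then check `coverage_rate` on test functions against the $1-\alpha$ target.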
