Operator Learning: A Statistical Perspective
Unique Subedi, Ambuj Tewari
TL;DR
Operator learning frames the approximation of mappings between function spaces as a function-to-function regression problem, with the aim of building fast surrogates for PDE solution operators and data-driven simulators. The article surveys linear, neural, and RKHS-based operator classes (e.g., DeepONet, Fourier Neural Operator), data-generation strategies based on Gaussian-process inputs, and several sources of error (approximation, truncation, discretization, statistical), and it highlights PDE-informed architectures. It identifies active data collection and uncertainty quantification as key future directions, arguing that leveraging PDE structure can yield data-efficient, scalable, and reliable surrogates. By bridging concepts from functional data analysis with modern operator-learning architectures, the work aims to lay a foundation for a general statistical theory and practice that scales to complex scientific problems.
Abstract
Operator learning has emerged as a powerful tool in scientific computing for approximating mappings between infinite-dimensional function spaces. A primary application of operator learning is the development of surrogate models for the solution operators of partial differential equations (PDEs). These methods can also be used to develop black-box simulators to model system behavior from experimental data, even without a known mathematical model. In this article, we begin by formalizing operator learning as a function-to-function regression problem and review some recent developments in the field. We also discuss PDE-specific operator learning, outlining strategies for incorporating physical and mathematical constraints into architecture design and training processes. We conclude by highlighting key future directions such as active data collection and the development of rigorous uncertainty quantification frameworks.
