In-Context Multi-Operator Learning with DeepOSets
Shao-Ting Chiu, Aditya Nambiar, Ali Syed, Jonathan W. Siegel, Ulisses Braga-Neto
TL;DR
The paper tackles learning solution operators for parametric PDEs across multiple equations in-context, without weight updates. It introduces DeepOSets, a non-autoregressive architecture that encodes prompt pairs with DeepSets and decodes with a DeepONet, enabling linear-time processing and automatic disambiguation of which operator to apply. A key theoretical result proves uniform universality over a compact class of continuous operators, ensuring a single model can approximate any operator in the class given enough in-prompt examples. Empirically, DeepOSets successfully handles Poisson and reaction-diffusion forward/inverse problems in-context, demonstrating fast, accurate predictions and disambiguation of unknown coefficients and boundary conditions. This work advances foundation-model-like capabilities for scientific computing by combining set-based prompt encoding with operator learning and providing a universal approximation guarantee.
Abstract
In-context learning (ICL) is the remarkable capability displayed by some machine learning models to learn from examples in a prompt, without any further weight updates. ICL was originally thought to emerge from the self-attention mechanism in autoregressive transformer architectures. DeepOSets is a non-autoregressive, non-attention-based neural architecture that combines set learning via the DeepSets architecture with operator learning via Deep Operator Networks (DeepONets). In a previous study, DeepOSets was shown to display ICL capabilities in supervised learning problems. In this paper, we show that the DeepOSets architecture, with the appropriate modifications, is a multi-operator in-context learner that can recover the solution operator of a new PDE, not seen during training, from example pairs of parameter and solution placed in a user prompt, without any weight updates. Furthermore, we show that DeepOSets is a uniform universal approximator over a class of continuous operators, which we believe is the first result of its kind in the scientific machine learning literature. This means that there exists a single DeepOSets architecture that approximates in-context any continuous operator in the class to any desired degree of accuracy, given an appropriate number of examples in the prompt. Experiments with Poisson and reaction-diffusion forward and inverse boundary-value problems demonstrate the ability of the proposed model to use in-context examples to accurately predict the solutions corresponding to parameter queries for PDEs not seen during training.
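To make the data flow described above concrete, the following is a minimal NumPy sketch of a DeepOSets-style forward pass, not the authors' implementation. It assumes that parameter and solution functions are discretized at M sensor points, that the DeepSets encoder pools per-example embeddings with a mean, and that the pooled prompt embedding is concatenated with the query parameter before entering the DeepONet branch network. All names, layer sizes, and the pooling choice are hypothetical, and the weights are random (untrained), so the output values are meaningless; only the shapes and the permutation invariance with respect to the prompt are illustrated.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random (untrained) weights for a small fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network: tanh on hidden layers, linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

# Hypothetical sizes (assumptions, not taken from the paper).
M = 16   # sensor points per discretized function
H = 32   # width of the pooled prompt embedding
P = 24   # number of DeepONet basis functions

phi = init_mlp([2 * M, 64, H])      # DeepSets encoder applied to each prompt pair
branch = init_mlp([H + M, 64, P])   # DeepONet branch: pooled prompt + query parameter
trunk = init_mlp([1, 64, P])        # DeepONet trunk: evaluation coordinate

def deeposets_forward(prompt_f, prompt_u, query_f, x):
    """Predict the solution at coordinates x for query_f, given prompt pairs.

    prompt_f, prompt_u: arrays of shape (n, M), example parameters and solutions.
    query_f: array of shape (M,), the new parameter function.
    x: array of shape (K,), coordinates where the solution is evaluated.
    """
    pairs = np.concatenate([prompt_f, prompt_u], axis=1)    # (n, 2M)
    pooled = mlp(phi, pairs).mean(axis=0)                   # (H,), order-independent
    coeffs = mlp(branch, np.concatenate([pooled, query_f])) # (P,)
    basis = mlp(trunk, x[:, None])                          # (K, P)
    return basis @ coeffs                                   # (K,)

# Usage with synthetic data: 5 in-context examples, 10 evaluation points.
prompt_f = rng.normal(size=(5, M))
prompt_u = rng.normal(size=(5, M))
query_f = rng.normal(size=M)
x = np.linspace(0.0, 1.0, 10)

u_pred = deeposets_forward(prompt_f, prompt_u, query_f, x)
perm = rng.permutation(5)
u_perm = deeposets_forward(prompt_f[perm], prompt_u[perm], query_f, x)
```

In this sketch each prompt pair is encoded independently and the embeddings are then averaged, so the cost of processing the prompt grows linearly with the number of examples, and reordering the examples leaves the prediction unchanged up to floating-point rounding. Changing the prompt pairs changes the pooled embedding, which is how a single set of weights could in principle represent different operators.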
