In-Context Multi-Operator Learning with DeepOSets

Shao-Ting Chiu, Aditya Nambiar, Ali Syed, Jonathan W. Siegel, Ulisses Braga-Neto

TL;DR

The paper tackles learning solution operators for parametric PDEs across multiple equations in-context, without weight updates. It introduces DeepOSets, a non-autoregressive architecture that encodes prompt pairs with DeepSets and decodes with a DeepONet, enabling linear-time processing and automatic disambiguation of which operator to apply. A key theoretical result proves uniform universality over a compact class of continuous operators, ensuring a single model can approximate any operator in the class given enough in-prompt examples. Empirically, DeepOSets successfully handles Poisson and reaction-diffusion forward/inverse problems in-context, demonstrating fast, accurate predictions and disambiguation of unknown coefficients and boundary conditions. This work advances foundation-model-like capabilities for scientific computing by combining set-based prompt encoding with operator learning and providing a universal approximation guarantee.

Abstract

In-context Learning (ICL) is the remarkable capability displayed by some machine learning models to learn from examples in a prompt, without any further weight updates. ICL had originally been thought to emerge from the self-attention mechanism in autoregressive transformer architectures. DeepOSets is a non-autoregressive, non-attention-based neural architecture that combines set learning via the DeepSets architecture with operator learning via Deep Operator Networks (DeepONets). In a previous study, DeepOSets was shown to display ICL capabilities in supervised learning problems. In this paper, we show that the DeepOSets architecture, with the appropriate modifications, is a multi-operator in-context learner that can recover the solution operator of a new PDE, not seen during training, from example pairs of parameter and solution placed in a user prompt, without any weight updates. Furthermore, we show that DeepOSets is a universal uniform approximator over a class of continuous operators, which we believe is the first result of its kind in the literature of scientific machine learning. This means that a single DeepOSets architecture exists that approximates in-context any continuous operator in the class to any fixed desired degree of accuracy, given an appropriate number of examples in the prompt. Experiments with Poisson and reaction-diffusion forward and inverse boundary-value problems demonstrate the ability of the proposed model to use in-context examples to accurately predict the solutions corresponding to parameter queries for PDEs not seen during training.
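
To make the data flow described above concrete, the following is a minimal, untrained NumPy sketch of a DeepSets encoder feeding a DeepONet decoder. It is not the authors' implementation: the layer sizes, the mean pooling, the way the pooled prompt embedding is concatenated with the query parameter before the branch network, and all names (`deeposets_forward`, `N_GRID`, `D_EMB`, `P`) are assumptions made for illustration. The summary above states only that prompt pairs are encoded with DeepSets and decoded with a DeepONet.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    """Random weights for a small fully connected network (untrained)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network: tanh on hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# Hypothetical sizes: grid points per function, set-embedding width, basis size.
N_GRID, D_EMB, P = 16, 32, 24

phi = init_mlp([2 * N_GRID, 64, D_EMB], rng)     # per-pair encoder (DeepSets part)
branch = init_mlp([D_EMB + N_GRID, 64, P], rng)  # DeepONet branch network
trunk = init_mlp([1, 64, P], rng)                # DeepONet trunk network

def deeposets_forward(prompt_u, prompt_s, query_u, query_x):
    """Predict the solution for query_u at the locations query_x.

    prompt_u, prompt_s: (n_examples, N_GRID) parameter/solution pairs on a grid.
    query_u: (N_GRID,) query parameter function; query_x: (n_points,) locations.
    """
    pairs = np.concatenate([prompt_u, prompt_s], axis=1)  # (n_examples, 2 * N_GRID)
    h = mlp(phi, pairs).mean(axis=0)                      # order-independent pooling
    b = mlp(branch, np.concatenate([h, query_u]))         # (P,) coefficients
    t = mlp(trunk, query_x[:, None])                      # (n_points, P) basis values
    return t @ b                                          # (n_points,) prediction

# Usage with random placeholder data (not PDE solutions).
x = np.linspace(0.0, 1.0, N_GRID)
prompt_u = rng.normal(size=(5, N_GRID))
prompt_s = rng.normal(size=(5, N_GRID))
query_u = rng.normal(size=N_GRID)
pred = deeposets_forward(prompt_u, prompt_s, query_u, x)
```

Each prompt pair passes through the encoder once and the results are averaged, so the cost of this sketch grows linearly with the number of examples and the output does not depend on their order. No weights change at inference time; only the prompt does.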

Paper Structure

This paper contains 6 sections, 3 theorems, 40 equations, 2 figures.

Key Result

Lemma 4.1

Let $K$ be a compact set in a metric space, $\delta > 0$ and $C > 1$. Then there always exists a $(\delta,C)$-discretization of $K$.

Figures (2)

  • Figure 1: DeepOSets architecture for in-context multi-operator learning.
  • Figure 2: In-context PDE solution operator learning with the DeepOSets architecture. In each case, the top plot contains the parameter functions while the bottom plot contains the corresponding solution functions. The gray lines indicate in each case the parameter-solution function pairs given in the prompt. The blue line indicates the query parameter function, while the red line indicates the corresponding solution predicted by the trained DeepOSets model. The black line indicates the exact solution corresponding to the query, using the appropriate PDE. The DeepOSets model is trained to learn the four PDE problems simultaneously, with coefficients sampled randomly.

Theorems & Definitions (9)

  • Definition 4.1: Grid
  • Lemma 4.1
  • proof
  • Theorem 4.2: Uniform universality for In-Context Operator Learning
  • proof
  • proof: Proof of Lemma 4.1
  • Lemma 6.1
  • proof
  • proof