PDE Generalization of In-Context Operator Networks: A Study on 1D Scalar Nonlinear Conservation Laws
Liu Yang, Stanley J. Osher
TL;DR
The paper addresses generalization in PDE learning through in-context operator learning with ICON-LM, applied to 1D scalar nonlinear conservation laws of the form $\partial_t u + \partial_x f(u)=0$. It demonstrates that a single transformer-based ICON model can infer forward and reverse operators from data prompts and apply them without weight updates, using data prompts built from conservation laws with cubic flux functions. It also shows generalization to PDEs with new flux forms by employing changes of variables and prompts with varying strides, which expands the range of solvable problems. This work advances toward a foundation model for PDE-related tasks under the in-context operator learning framework.
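To illustrate how a change of variables can map an equation with a new flux back into a known family, consider the following worked step. The affine map and the symbols $\alpha$, $\beta$, and $g$ are assumptions chosen for illustration and are not necessarily the transformation used in the paper. Suppose $u$ solves $\partial_t u + \partial_x f(u) = 0$ and define $v = \alpha u + \beta$ with $\alpha \neq 0$. Then

$$
\partial_t v = \alpha\,\partial_t u = -\alpha\,\partial_x f(u) = -\partial_x\!\left[\alpha f\!\left(\frac{v-\beta}{\alpha}\right)\right],
\qquad\text{so}\qquad
\partial_t v + \partial_x g(v) = 0,\quad g(v) = \alpha f\!\left(\frac{v-\beta}{\alpha}\right).
$$

Under this assumption, $v$ again satisfies a scalar conservation law, and if $f$ is a cubic polynomial then $g$ is also a cubic polynomial with different coefficients.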
Abstract
Can we build a single large model for a wide range of PDE-related scientific learning tasks? Can this model generalize to new PDEs, even of new forms, without any fine-tuning? In-context operator learning and the corresponding model, In-Context Operator Networks (ICON), represent an initial exploration of these questions. ICON's capability with respect to the first question has been demonstrated previously. In this paper, we present a detailed methodology for solving PDE problems with ICON, and show how a single ICON model can make forward and reverse predictions for different equations with different strides, given appropriately designed data prompts. We present positive evidence for the second question: ICON can generalize well to some PDEs of new forms without any fine-tuning. This is exemplified through a study on 1D scalar nonlinear conservation laws, a family of PDEs with temporal evolution. We also show how to broaden the range of problems that an ICON model can address by transforming functions and equations into ICON's capability scope. We believe that the progress in this paper is a significant step towards the goal of training a foundation model for PDE-related tasks under the in-context operator learning framework.
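The notion of a data prompt for forward and reverse prediction with a given stride can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' data pipeline: it uses a first-order Lax-Friedrichs scheme with periodic boundaries as a stand-in solver, sinusoidal initial data, and hypothetical names (`flux`, `step_forward`, `make_examples`, `stride_time`). Each example pairs a condition (the input function) with a quantity of interest (the output function); swapping the two gives the reverse task.

```python
import numpy as np

def flux(u, a, b, c):
    """Cubic flux f(u) = a*u**3 + b*u**2 + c*u (coefficients are illustrative)."""
    return a * u**3 + b * u**2 + c * u

def step_forward(u0, a, b, c, dx, t_end):
    """Advance u0 by time t_end using Lax-Friedrichs with periodic boundaries.

    Assumption: this simple scheme stands in for whatever solver generates
    the real training data.
    """
    u = u0.copy()
    t = 0.0
    while t < t_end - 1e-12:
        # Largest wave speed |f'(u)| sets the CFL-limited time step.
        speed = np.max(np.abs(3 * a * u**2 + 2 * b * u + c)) + 1e-12
        dt = min(0.5 * dx / speed, t_end - t)
        up, um = np.roll(u, -1), np.roll(u, 1)  # u_{i+1}, u_{i-1}
        u = 0.5 * (up + um) - 0.5 * dt / dx * (flux(up, a, b, c) - flux(um, a, b, c))
        t += dt
    return u

def make_examples(rng, n_examples, a, b, c, nx=100, stride_time=0.1, reverse=False):
    """Build (condition, QoI) pairs that share one hidden operator.

    Forward task: condition = u(0, x), QoI = u(stride_time, x).
    Reverse task: the two roles are swapped.
    """
    x = np.linspace(0.0, 1.0, nx, endpoint=False)
    dx = 1.0 / nx
    conds, qois = [], []
    for _ in range(n_examples):
        amp = rng.uniform(0.2, 1.0)
        phase = rng.uniform(0.0, 2.0 * np.pi)
        u0 = amp * np.sin(2.0 * np.pi * x + phase)
        u1 = step_forward(u0, a, b, c, dx, stride_time)
        conds.append(u1 if reverse else u0)
        qois.append(u0 if reverse else u1)
    return x, np.stack(conds), np.stack(qois)

# Three forward examples for one fixed (hypothetical) cubic flux.
x, cond, qoi = make_examples(np.random.default_rng(0), 3, a=1.0, b=0.5, c=-0.2)
```

In this sketch, a prompt would consist of the example pairs plus a new condition for which the model predicts the QoI; changing `stride_time` changes the operator the examples encode, while the model weights would stay fixed.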
