Towards a Foundation Model for Partial Differential Equations: Multi-Operator Learning and Extrapolation
Jingmin Sun, Yuxuan Liu, Zecheng Zhang, Hayden Schaeffer
TL;DR
This paper presents PROSE-PDE, a bi-modal transformer-based foundation model for one-dimensional time-dependent PDEs that can predict future states while simultaneously learning the governing equations. By employing symbolic encoding via Polish notation and a five-component architecture (Data/Symbol Encoders, Feature Fusion, Data/Symbol Decoders), the model supports multi-operator learning and cross-modal information exchange through cross-attention, enabling extrapolation to unseen systems without fine-tuning. The authors demonstrate EPF-type generalization across shocks and rarefactions, unseen operators, and varying input conditions, along with ablations showing the essential role of the symbolic modality. These advances suggest a path toward general-purpose PDE solvers that can generalize across models, parameters, and physical regimes with limited task-specific tuning, aiding applications in physics, geology, and biology.
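To make the symbolic encoding concrete, the following is a minimal sketch of how an equation's expression tree can be flattened into Polish (prefix) notation, where each operator precedes its operands. The `Node` class, the `to_prefix` helper, and token names such as `add`, `mul`, `u_t`, `u_x`, and `q` are hypothetical and chosen only for illustration; the actual PROSE-PDE vocabulary and tokenizer may differ.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """An operator applied to an ordered list of child expressions."""
    op: str
    children: List["Expr"]


# A leaf is a plain string token (variable, derivative, or coefficient).
Expr = Union[Node, str]


def to_prefix(expr: Expr) -> List[str]:
    """Flatten an expression tree into Polish (prefix) notation tokens."""
    if isinstance(expr, str):
        return [expr]
    tokens = [expr.op]  # the operator comes first
    for child in expr.children:
        tokens.extend(to_prefix(child))  # then each operand, left to right
    return tokens


# Hypothetical example: the expression u_t + q * (u * u_x),
# i.e. the left-hand side of an inviscid Burgers-type equation.
burgers_lhs = Node("add", ["u_t", Node("mul", ["q", Node("mul", ["u", "u_x"])])])
tokens = to_prefix(burgers_lhs)
print(tokens)  # ['add', 'u_t', 'mul', 'q', 'mul', 'u', 'u_x']
```

Because every operator in this sketch has a fixed number of operands, the prefix sequence needs no parentheses and can be parsed back into a tree unambiguously, which is what makes this kind of encoding convenient as input to a sequence model.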
Abstract
Foundation models, such as large language models, have demonstrated success in addressing various language and image processing tasks. In this work, we introduce a multi-modal foundation model for scientific problems, named PROSE-PDE. Our model, designed for bi-modality to bi-modality learning, is a multi-operator learning approach that can predict future states of spatiotemporal systems while concurrently learning the underlying governing equations of the physical system. Specifically, we focus on multi-operator learning by training on distinct one-dimensional time-dependent nonlinear constant-coefficient partial differential equations, with potential applications in many fields, including physics, geology, and biology. More importantly, we provide three extrapolation studies to demonstrate that PROSE-PDE can generalize physical features through the robust training of multiple operators and that the proposed model can extrapolate to predict PDE solutions whose models or data were unseen during training. Furthermore, we show through systematic numerical experiments that the use of the symbolic modality in our model effectively resolves the well-posedness problems that arise when training multiple operators and thus enhances our model's predictive capabilities.
