One-shot learning for solution operators of partial differential equations

Anran Jiao, Haiyang He, Rishikesh Ranade, Jay Pathak, Lu Lu

TL;DR

This work tackles the data efficiency challenge in learning PDE solution operators by introducing a one-shot method that exploits locality through a local operator $\tilde{\mathcal{G}}$ defined on a small domain $\tilde{\Omega}$. The approach trains $\tilde{\mathcal{G}}$ from a single PDE solution and predicts the global solution using either mesh-based fixed-point iteration (FPI) or meshfree neural networks (LOINN, cLOINN), with variants that use grid or random sampling of points. Demonstrations across 1D and 2D problems, including linear and nonlinear diffusion, advection, and diffusion-reaction systems in complex geometries, show competitive accuracy and strong generalization to new forcing terms $f$ and parameter settings, while markedly reducing data requirements. This framework advances PDE operator learning by enabling fast, data-efficient predictions and opens avenues for integrating graph-based models to handle more complex domains and boundary conditions.

Abstract

Learning and solving governing equations of a physical system, represented by partial differential equations (PDEs), from data is a central challenge in a variety of areas of science and engineering. Traditional numerical methods for solving PDEs can be computationally expensive for complex systems and require the complete PDEs of the physical system. On the other hand, current data-driven machine learning methods require a large amount of data to learn a surrogate model of the PDE solution operator, which could be impractical. Here, we propose the first solution operator learning method that only requires one PDE solution, i.e., one-shot learning. By leveraging the principle of locality of PDEs, we consider small local domains instead of the entire computational domain and define a local solution operator. The local solution operator is then trained using a neural network, and utilized to predict the solution of a new input function via mesh-based fixed-point iteration (FPI), meshfree local-solution-operator informed neural network (LOINN) or local-solution-operator informed neural network with correction (cLOINN). We test our method on diverse PDEs, including linear or nonlinear PDEs, PDEs defined on complex geometries, and PDE systems, demonstrating the effectiveness and generalization capabilities of our method across these varied scenarios.
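
The mesh-based fixed-point iteration (FPI) mentioned above can be illustrated with a minimal sketch. It assumes a 1D equation $u'' = f$ with zero Dirichlet boundary conditions, and it replaces the trained neural network with a closed-form central-difference stencil, so `local_operator` is a hypothetical stand-in rather than the authors' learned $\tilde{\mathcal{G}}$. With that stand-in, the loop reduces to a classical Jacobi iteration; the function names, grid size, and tolerance are illustrative choices.

```python
import numpy as np

def local_operator(u_left, u_right, f_center, h):
    """Stand-in for the learned local operator (an assumption, not a trained network).

    For u'' = f, the central-difference stencil gives
    u(x) = (u(x - h) + u(x + h) - h^2 f(x)) / 2.
    """
    return 0.5 * (u_left + u_right - h**2 * f_center)

def fixed_point_iteration(f, h, u0, tol=1e-10, max_iter=200_000):
    """Apply the local operator at every interior node until the update stalls."""
    u = u0.copy()
    for it in range(max_iter):
        u_new = u.copy()
        # Boundary nodes keep their initial values (Dirichlet conditions).
        u_new[1:-1] = local_operator(u[:-2], u[2:], f[1:-1], h)
        if np.max(np.abs(u_new - u)) < tol:
            return u_new, it + 1
        u = u_new
    return u, max_iter

n = 33
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
f = -np.pi**2 * np.sin(np.pi * x)   # chosen so that the exact solution is sin(pi x)
u0 = np.zeros(n)                    # initial guess; also fixes u = 0 at both ends

u, n_iter = fixed_point_iteration(f, h, u0)
u_exact = np.sin(np.pi * x)
rel_err = np.linalg.norm(u - u_exact) / np.linalg.norm(u_exact)
print(f"converged after {n_iter} sweeps, relative L2 error {rel_err:.2e}")
```

In the method described by the abstract, the stencil would be replaced by a network trained on a single solution, while the outer iteration keeps the same shape: sweep the local operator over the global mesh until the solution stops changing.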

Paper Structure

This paper contains 22 sections, 30 equations, 10 figures, 4 tables, 1 algorithm.

Figures (10)

  • Figure 1: Workflow of the one-shot learning method for solution operators. (Step 1) We select a suitable polygon, such as a rectangle, on a local mesh with step sizes $\Delta x_1$ and $\Delta x_2$, and thus define a local domain $\tilde{\Omega}$ (the black nodes). (Step 2) We select a target mesh node $\mathbf{x}^*$ and define a local solution operator $\tilde{\mathcal{G}}$. (Step 3) We learn $\tilde{\mathcal{G}}$ using a neural network from a dataset constructed from $\mathcal{T} = (f_\mathcal{T}, u_\mathcal{T})$. (Step 4) For a new PDE condition (i.e., a new input function $f$), we use the pre-trained $\tilde{\mathcal{G}}$ to find the corresponding PDE solution with one of the following approaches. (Approach 1, FPI) We consider points on an equispaced global mesh. Starting with an initial guess $u_0(\mathbf{x})$, we apply $\tilde{\mathcal{G}}$ iteratively to update the PDE solution until it converges. (Approach 2, LOINN) We use a network to approximate the PDE solution. We apply $\tilde{\mathcal{G}}$ at different random locations to compute the loss function. (Approach 3, cLOINN) We use a network to approximate the difference between the PDE solution and the given $u_0(\mathbf{x})$.
  • Figure 2: Selecting local domains and learning local solution operators. (A) A general choice of the local domain $\tilde{\Omega}$ is a set of nodes on a polygon. $\tilde{\Omega}$ can have different shapes and sizes according to a specific PDE. A local solution operator $\tilde{\mathcal{G}}$ is defined on $\tilde{\Omega}$. (B to F) Examples of local domains and local solution operators used in this paper. (G and H) When learning $\tilde{\mathcal{G}}$, the training points based on $\tilde{\Omega}$ are either (G) selected on a global mesh or (H) randomly sampled.
  • Figure 3: 1D Poisson equation in Section \ref{sec: Poisson1D}. (A) The training data includes a random $f_{\mathcal{T}}$ generated from GRF and the corresponding solution $u_{\mathcal{T}}$. (B) Testing examples of random $f = f_0 + \Delta f$ with $\Delta f$ sampled from GRF of $\sigma_{\text{test}} = 0.02, 0.05, 0.10, 0.15$ and $l = 0.1$. (C) The convergence of $L^2$ relative errors of different approaches for various test cases. (D) Prediction example of different approaches for various test cases.
  • Figure 4: Learning the linear diffusion equation in Section \ref{sec: LinearDiffusion}. (A) The training data includes a random $f_{\mathcal{T}}$ generated from GRF and the corresponding solution $u_{\mathcal{T}}$. (B) The convergence of $L^2$ relative errors of different approaches for various test cases. (C) Prediction example of different approaches for a test case with $\sigma_{\text{test}} = 0.1$.
  • Figure 5: Learning the nonlinear diffusion-reaction equation in Section \ref{sec: NonlinearDiffusion}. (A) The training data includes a random $f_{\mathcal{T}}$ generated from GRF and the corresponding solution $u_{\mathcal{T}}$. (B) The convergence of $L^2$ relative errors of different approaches for various test cases. (C) Prediction example of different approaches for a test case with $\sigma_{\text{test}} = 0.1$. (D) $L^2$ relative errors for test functions with $\sigma_{\text{test}} = 0.1, 0.3, 0.5, 0.8$ when using different numbers of point locations, showing the effect of mesh resolution.
  • ...and 5 more figures
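
Steps 1 to 3 of the Figure 1 caption (choose a local domain, define the local operator, build a training set from one pair $\mathcal{T} = (f_\mathcal{T}, u_\mathcal{T})$) can be sketched in 1D as follows. Everything specific here is an assumption for illustration: the three-node local domain, the choice of inputs $[u(x-h), u(x+h), f(x)]$, the equation $u'' = f$, the synthetic training pair, and the use of linear least squares in place of neural-network training. The helper name `build_local_dataset` is hypothetical.

```python
import numpy as np

def build_local_dataset(f_T, u_T):
    """Slide a three-node local domain over one (f_T, u_T) pair.

    Each interior node x* yields one sample:
      inputs  = [u(x* - h), u(x* + h), f(x*)]
      target  = u(x*)
    This input layout is an illustrative assumption.
    """
    inputs = np.stack([u_T[:-2], u_T[2:], f_T[1:-1]], axis=1)
    targets = u_T[1:-1]
    return inputs, targets

# One synthetic training pair (hypothetical): a two-frequency forcing term and
# the matching finite-difference solution of u'' = f with u(0) = u(1) = 0.
n = 101
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
f_T = np.sin(2 * np.pi * x) + 0.5 * np.cos(5 * np.pi * x)

m = n - 2  # number of interior nodes
A = (np.diag(-2.0 * np.ones(m))
     + np.diag(np.ones(m - 1), 1)
     + np.diag(np.ones(m - 1), -1)) / h**2
u_T = np.zeros(n)
u_T[1:-1] = np.linalg.solve(A, f_T[1:-1])

# A single solution already provides one training sample per interior node.
X, y = build_local_dataset(f_T, u_T)

# Linear least squares as a stand-in for training a neural network.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
residual = np.max(np.abs(X @ w - y))
print(X.shape, y.shape, residual)
```

The point of the sketch is the sample count: because the operator is local, one global solution on a mesh with 101 nodes yields 99 input-output pairs, which is what makes learning from a single solution plausible.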