Generating synthetic data for neural operators
Erisa Hasani, Rachel A. Ward
TL;DR
The paper tackles the data bottleneck in neural PDE solvers by introducing backward data generation: candidate solutions $u$ are sampled from a known Sobolev space and differentiated to obtain the forcing term $f$, yielding exact synthetic training pairs for neural operators. This data is used to train a Fourier Neural Operator to learn the solution operator for a class of elliptic PDEs, with experiments covering the Poisson equation and linear and semilinear second-order PDEs under both Dirichlet and Neumann boundary conditions. Results show that models trained solely on synthetic backward data generalize well to data produced by classical solvers, outperforming forward data generation in several scenarios with non-trigonometric right-hand sides and offering significant speedups in data generation. The approach is architecture-agnostic and could potentially be extended to other PDE classes and transfer-learning setups, reducing dependence on numerical solvers for training data while maintaining accuracy across varied domains and operators.
Abstract
Recent advances in the literature show promising potential of deep learning methods, particularly neural operators, in obtaining numerical solutions to partial differential equations (PDEs) beyond the reach of current numerical solvers. However, existing data-driven approaches often rely on training data produced by numerical PDE solvers (e.g., finite difference or finite element methods). We introduce a "backward" data generation method that avoids solving the PDE numerically: by randomly sampling candidate solutions $u_j$ from the appropriate solution space (e.g., $H_0^1(\Omega)$), we compute the corresponding right-hand side $f_j$ directly from the equation by differentiation. This produces training pairs $\{(f_j, u_j)\}$ by computing derivatives rather than solving a PDE numerically for each data point, enabling fast, large-scale data generation consisting of exact solutions. Experiments indicate that models trained on this synthetic data generalize well when tested on data produced by standard solvers. While the idea is simple, we hope this method will expand the potential of neural PDE solvers that do not rely on classical numerical solvers to generate their data.
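To make the backward idea concrete, here is a minimal sketch for one illustrative case: the Poisson problem $-\Delta u = f$ on the unit square with homogeneous Dirichlet boundary conditions. The sine basis, the Gaussian coefficients, the `decay` parameter, and the function name `sample_backward_pair` are assumptions chosen for illustration and are not taken from the paper. A candidate solution is drawn as a random combination of $\sin(k\pi x)\sin(l\pi y)$, each of which vanishes on the boundary, and the right-hand side follows by differentiating each mode analytically, so no linear system is solved.

```python
import numpy as np


def sample_backward_pair(n_grid=64, n_modes=8, decay=2.0, rng=None):
    """Return one (f, u) pair with -Laplacian(u) = f on the unit square.

    Hypothetical helper: the basis and coefficient distribution are
    illustrative assumptions, not the sampling scheme of the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.linspace(0.0, 1.0, n_grid)
    k = np.arange(1, n_modes + 1)

    # S[k-1, i] = sin(k * pi * x_i); every mode is zero at x = 0 and x = 1.
    S = np.sin(np.pi * np.outer(k, x))

    # Eigenvalues of -Laplacian for the mode sin(k pi x) sin(l pi y).
    kk, ll = np.meshgrid(k, k, indexing="ij")
    eig = np.pi**2 * (kk**2 + ll**2)

    # Random coefficients, damped at high frequency (assumed decay law).
    coeffs = rng.standard_normal((n_modes, n_modes)) / eig**decay

    # u[i, j] = sum_{k,l} a_kl sin(k pi x_i) sin(l pi y_j)
    u = S.T @ coeffs @ S
    # Differentiating term by term multiplies each coefficient by its eigenvalue.
    f = S.T @ (coeffs * eig) @ S
    return f, u


if __name__ == "__main__":
    f, u = sample_backward_pair(rng=np.random.default_rng(0))
    print(f.shape, u.shape)
```

Generating a training set under these assumptions amounts to calling the function repeatedly; the cost per pair is a few small matrix products, in contrast to the forward approach of sampling $f$ and solving for $u$.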
