Pretraining a Neural Operator in Lower Dimensions

AmirPouya Hemmasian, Amir Barati Farimani

TL;DR

This work Pretrains neural PDE solvers on Lower-Dimensional PDEs (PreLowD), where data collection is least expensive. It uses the Factorized Fourier Neural Operator (FFNO) because it is flexible enough to be applied to PDE data of arbitrary spatial dimension.

Abstract

There has recently been increasing attention towards developing foundational neural Partial Differential Equation (PDE) solvers and neural operators through large-scale pretraining. However, unlike vision and language models, which make use of abundant and inexpensive (unlabeled) data for pretraining, these neural solvers usually rely on simulated PDE data, which can be costly to obtain, especially for high-dimensional PDEs. In this work, we aim to Pretrain neural PDE solvers on Lower-Dimensional PDEs (PreLowD), where data collection is least expensive. We evaluate the effectiveness of this pretraining strategy on similar PDEs in higher dimensions. We use the Factorized Fourier Neural Operator (FFNO) because it is flexible enough to be applied to PDE data of arbitrary spatial dimension and to reuse parameters trained in lower dimensions. In addition, our work sheds light on how the fine-tuning configuration affects the benefit obtained from this pretraining strategy. Code is available at https://github.com/BaratiLab/PreLowD.

Paper Structure (12 sections, 9 equations, 3 figures, 7 tables, 1 algorithm)

This paper contains 12 sections, 9 equations, 3 figures, 7 tables, 1 algorithm.

Figures (3)

  • Figure 1: a) General schematic of a neural operator. b) The nonlinear operator layer in FNO and FFNO. c) The kernel integral operator in FNO. $LPF$ stands for a low pass filter that keeps the first few Fourier modes in each axis and discards the higher frequency modes. d) 1D and 2D factorized kernel integral operator in FFNO. The red arrows show the possible transfer of pretrained parameters from a 1D model to a 2D model if $M_x=M_y$.
  • Figure 2: Average $rL_2$ loss in percentage for the diffusion equation. C0 is the randomly initialized model and the rest are PreLowDed models fine-tuned according to Table \ref{table:tune-configs}. The left column shows the next-step error (rollout=1) and the right column shows the average error over the next five autoregressive steps (rollout=5).
  • Figure 3: Average $rL_2$ loss in percentage for the advection equation. C0 is the randomly initialized model and the rest are PreLowDed models fine-tuned according to Table \ref{table:tune-configs}. The left column shows the next-step error (rollout=1) and the right column shows the average error over the next five autoregressive steps (rollout=5).
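
The factorized kernel integral operator described in the Figure 1 caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation (see the linked repository for that). It assumes the operator is a sum of per-axis spectral convolutions, each keeping only the first few Fourier modes (the $LPF$), and it assumes a simple transfer rule in which the 1D weights are copied to both axes of the 2D model, which requires $M_x=M_y$. The names `spectral_conv_along_axis`, `factorized_kernel`, and `transfer_1d_to_2d` are hypothetical.

```python
import numpy as np


def spectral_conv_along_axis(u, weight, axis):
    """Low-pass-filtered spectral convolution along one spatial axis.

    u:      real array, shape (channels_in, *spatial)
    weight: complex array, shape (modes, channels_in, channels_out)
    axis:   spatial axis to transform (1 for x, 2 for y, ...)
    """
    modes = weight.shape[0]
    n = u.shape[axis]
    u_hat = np.fft.rfft(u, axis=axis)
    # Put the frequency axis second-to-last and the channel axis last.
    u_hat = np.moveaxis(u_hat, (axis, 0), (-2, -1))
    out_hat = np.zeros(u_hat.shape[:-1] + (weight.shape[2],), dtype=complex)
    # LPF: only the first `modes` frequencies are mixed; the rest stay zero.
    out_hat[..., :modes, :] = np.einsum(
        "...mi,mio->...mo", u_hat[..., :modes, :], weight
    )
    # Restore the original (channels, *spatial) layout and invert the FFT.
    out_hat = np.moveaxis(out_hat, (-2, -1), (axis, 0))
    return np.fft.irfft(out_hat, n=n, axis=axis)


def factorized_kernel(u, weights):
    """Sum of per-axis spectral convolutions; weights[d] acts on axis d + 1."""
    return sum(
        spectral_conv_along_axis(u, w, axis=d + 1) for d, w in enumerate(weights)
    )


def transfer_1d_to_2d(weight_1d):
    """Assumed transfer rule: reuse the 1D weights on both axes (M_x == M_y)."""
    return [weight_1d.copy(), weight_1d.copy()]


rng = np.random.default_rng(0)
channels, modes, nx, ny = 4, 8, 32, 32
shape = (modes, channels, channels)
w_1d = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

u_1d = rng.standard_normal((channels, nx))
u_2d = rng.standard_normal((channels, nx, ny))

out_1d = factorized_kernel(u_1d, [w_1d])                     # 1D model
out_2d = factorized_kernel(u_2d, transfer_1d_to_2d(w_1d))    # 2D model, reused weights
print(out_1d.shape, out_2d.shape)
```

Because each weight tensor depends only on the number of retained modes and the channel widths, and not on the number of spatial axes, the same array can be applied along any axis. That property is what makes a 1D-to-2D parameter transfer possible in this sketch.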