DeepONet as a Multi-Operator Extrapolation Model: Distributed Pretraining with Physics-Informed Fine-Tuning

Zecheng Zhang, Christian Moya, Lu Lu, Guang Lin, Hayden Schaeffer

TL;DR

This work proposes a novel fine-tuning method to achieve multi-operator learning through training a distributed neural operator with diverse function data and then zero-shot fine-tuning the neural network using physics-informed losses for downstream tasks.

Abstract

We propose a novel fine-tuning method for multi-operator learning: a distributed neural operator is first trained on diverse function data, and the network is then fine-tuned zero-shot with physics-informed losses for downstream tasks. Operator learning effectively approximates solution operators for PDEs and various PDE-related problems, yet it often struggles to generalize to new tasks. To address this, we investigate fine-tuning a pretrained model, carefully selecting an initialization that enables rapid adaptation to new tasks with minimal data. Our approach uses distributed learning to integrate data from various operators during pretraining and physics-informed methods to enable zero-shot fine-tuning, minimizing the reliance on downstream data. We investigate standard fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning, applying both to train complex nonlinear target operators that are difficult to learn from random initialization alone. Through comprehensive numerical examples, we demonstrate the advantages of our approach, showing significant improvements in accuracy. Our findings provide a robust framework for advancing multi-operator learning and highlight the potential of transfer learning techniques in this domain.
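To make the Low-Rank Adaptation idea mentioned in the abstract concrete, the following is a minimal NumPy sketch of the standard LoRA formulation, in which a frozen pretrained weight $W$ is augmented by a trainable low-rank update $\frac{\alpha}{r} BA$. The layer sizes, the rank `r`, the scaling `alpha`, and the function name `lora_forward` are illustrative assumptions; the abstract does not state which layers of the neural operator are adapted or which rank is used.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Linear layer with frozen weight W plus low-rank update (alpha / r) * B @ A."""
    r = A.shape[0]
    return x @ (W + (alpha / r) * (B @ A)).T

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4                 # hypothetical sizes, not from the paper

W = rng.normal(size=(d_out, d_in))         # pretrained weight, kept frozen
A = rng.normal(size=(r, d_in)) * 0.01      # trainable, small random init
B = np.zeros((d_out, r))                   # trainable, zero init: update starts at 0

x = rng.normal(size=(8, d_in))             # a batch of 8 inputs
y_pretrained = x @ W.T
y_lora = lora_forward(x, W, A, B)          # equals y_pretrained before any training

full_params = W.size                       # parameters updated by full fine-tuning
lora_params = A.size + B.size              # parameters updated by LoRA
```

Because `B` starts at zero, the adapted layer reproduces the pretrained layer exactly at the start of fine-tuning, and only `A` and `B` would receive gradient updates. In this toy setting LoRA trains 512 parameters per layer instead of 4096.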

Paper Structure

This paper contains 21 sections, 16 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Methodology demonstration for the downstream PDE $u_t - u_{xx} = f$ with an initial condition (IC) and Dirichlet boundary conditions (BC). $\mathcal{AD}$ denotes the automatic differentiation provided by modern machine learning software, and $\bigotimes$ denotes the inner product.
  • Figure 2: Left: Decay of the relative error (log scale) with respect to training epochs. The three full fine-tuning curves are dashed lines, and the two LoRA training curves are solid lines. The final relative errors are presented in Table \ref{table_vburgers_results}. Right: A demonstration of the predictions.
  • Figure 3: Demonstration for two solution operators: (1) the diffusion-reaction system (\ref{eqn_dr}) with $g(u) = -(u-0.5)(u-1)$ and $f(x) = \cos(\pi x)$, and (2) the porous media system (\ref{eqn_nl_porous_media}) with degree 2 and $f(x) = \frac{1}{5} \sin(2\pi x)$. Porous media solutions are used to construct the MODNO/D2NO pretraining model, while the downstream task is the reaction-diffusion system. We ran the numerical experiments 8 times with different random seeds; the standard deviations for the experimental settings are 0.04 for the RD pretrained model, 0.05 for the PM pretrained model, and 0.46 for random initialization.
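
The pipeline summarized in the Figure 1 caption can be sketched as follows: the DeepONet prediction is the inner product of branch coefficients and trunk features, and the physics-informed loss penalizes the residual of $u_t - u_{xx} = f$ at collocation points. This is a sketch under several assumptions. Central finite differences stand in for the automatic differentiation named in the caption, so that the example needs only NumPy. The trunk `toy_trunk` is a hypothetical fixed basis rather than a trained network, the branch coefficients are held fixed instead of being computed from samples of $f$, and $f$ is taken to depend on $x$ only. The IC and BC loss terms are omitted.

```python
import numpy as np

def deeponet_output(branch_coeffs, trunk_fn, x, t):
    """DeepONet prediction: inner product of branch coefficients and trunk features."""
    return trunk_fn(x, t) @ branch_coeffs      # (n_points, p) @ (p,) -> (n_points,)

def heat_residual(u_fn, f_fn, x, t, h=1e-3):
    """Residual u_t - u_xx - f, using central differences as a stand-in for AD."""
    u_t = (u_fn(x, t + h) - u_fn(x, t - h)) / (2 * h)
    u_xx = (u_fn(x + h, t) - 2 * u_fn(x, t) + u_fn(x - h, t)) / h**2
    return u_t - u_xx - f_fn(x)

def toy_trunk(x, t, p=3):
    """Hypothetical trunk: modes exp(-(k*pi)^2 t) * sin(k*pi*x) for k = 1..p."""
    k = np.arange(1, p + 1)
    return np.exp(-(k * np.pi) ** 2 * t[:, None]) * np.sin(k * np.pi * x[:, None])

rng = np.random.default_rng(0)
x = rng.uniform(0.1, 0.9, size=200)            # interior collocation points in space
t = rng.uniform(0.1, 0.5, size=200)            # collocation points in time
coeffs = np.array([1.0, 0.5, -0.25])           # fixed stand-in for the branch output

u_fn = lambda xx, tt: deeponet_output(coeffs, toy_trunk, xx, tt)
zero_f = lambda xx: np.zeros_like(xx)          # source term f = 0
unit_f = lambda xx: np.ones_like(xx)           # source term f = 1

# Mean squared PDE residual: the physics-informed loss without IC/BC terms.
loss_matched = np.mean(heat_residual(u_fn, zero_f, x, t) ** 2)
loss_mismatched = np.mean(heat_residual(u_fn, unit_f, x, t) ** 2)
```

Each mode of the toy trunk solves the homogeneous heat equation, so `loss_matched` is close to zero up to discretization error, while `loss_mismatched` is of order one. No downstream solution data enters either loss, which is the sense in which this kind of fine-tuning is zero-shot.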