L-HYDRA: Multi-Head Physics-Informed Neural Networks

Zongren Zou, George Em Karniadakis

TL;DR

MH-PINNs form a two-stage framework in which a shared nonlinear body $\Phi$ serves $M$ tasks through task-specific heads $H_k$, and a normalizing-flow model $\hat{p}(H)$ of the heads enables generative modeling and uncertainty quantification. The setup supports multi-task learning, generative modeling of stochastic inputs, and few-shot physics-informed learning, in which the downstream head is either fine-tuned with a learned regularization or estimated by Bayesian inference under the learned prior. The authors demonstrate the method on five SciML benchmarks, including forward and inverse ODEs/PDEs, and report accurate predictions and calibrated uncertainty even with limited data. They also analyze basis-function learning, the effect of head initialization, and the trade-offs between multi-task learning (MTL) and single-task learning (STL), and release the open-source L-HYDRA code.

Abstract

We introduce multi-head neural networks (MH-NNs) to physics-informed machine learning. An MH-NN is a neural network (NN) whose nonlinear hidden layers form a shared body and whose multiple linear output layers form the heads. On this basis we construct multi-head physics-informed neural networks (MH-PINNs) as a potent tool for multi-task learning (MTL), generative modeling, and few-shot learning for diverse problems in scientific machine learning (SciML). MH-PINNs connect multiple functions/tasks via a shared body, which provides the basis functions, and via a shared distribution for the head. The former is accomplished by solving multiple tasks with MH-PINNs, with each head corresponding independently to one task, and the latter by employing normalizing flows (NFs) for density estimation and generative modeling. The method therefore has two stages, and both can be tackled with standard deep learning tools for NNs, enabling easy implementation in practice. MH-PINNs can be used for various purposes, such as approximating stochastic processes, solving multiple tasks synergistically, providing informative prior knowledge for downstream few-shot learning tasks such as meta-learning and transfer learning, learning representative basis functions, and uncertainty quantification. We demonstrate the effectiveness of MH-PINNs on five benchmarks, and also investigate the possibility of synergistic learning in regression analysis. We name the open-source code "Lernaean Hydra" (L-HYDRA), since this mythical creature possessed many heads for performing multiple important tasks, as in the proposed method.
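
The body/head split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the L-HYDRA implementation: the layer sizes, the `tanh` activation, the initialization scales, and the names `init_mh_nn`, `body`, and `forward` are all assumptions made for illustration, and no physics-informed loss or training loop is included.

```python
import numpy as np

def init_mh_nn(rng, in_dim, width, n_heads):
    """Random parameters: one shared tanh body and n_heads linear heads.

    All sizes and scales here are illustrative assumptions.
    """
    return {
        "W1": rng.normal(0.0, 1.0, (in_dim, width)),
        "b1": np.zeros(width),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(width), (width, width)),
        "b2": np.zeros(width),
        # Row k holds head H_k: `width` weights followed by one bias.
        "heads": rng.normal(0.0, 0.05, (n_heads, width + 1)),
    }

def body(params, x):
    """Shared nonlinear body Phi: maps (N, in_dim) inputs to (N, width) basis values."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return np.tanh(h @ params["W2"] + params["b2"])

def forward(params, x):
    """Evaluate every head on x. Returns (n_heads, N); row k is u_k(x)."""
    phi = body(params, x)
    # Append a constant column so each head's bias is part of the linear map.
    phi1 = np.concatenate([phi, np.ones((phi.shape[0], 1))], axis=1)
    return params["heads"] @ phi1.T

rng = np.random.default_rng(0)
params = init_mh_nn(rng, in_dim=1, width=20, n_heads=5)
x = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
u = forward(params, x)  # one output curve per task/head
```

Because every head is linear in the body's output, each task's prediction is a linear combination of the same learned basis functions; only the coefficient vector $H_k$ differs between tasks.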

Paper Structure (22 sections, 15 equations, 12 figures, 9 tables)

Figures (12)

  • Figure 1: Schematic view of the structure of multi-head physics-informed neural networks (MH-PINNs) with $M$ different heads, which are built upon conventional multi-head neural networks. The shared layers are often referred to as the body and the task-specific layer as the head. Generally, $u_k$, $k=1,\dots,M$, represent $M$ solutions to $M$ different ODEs/PDEs, formulated in Eq. \ref{eq:problem}, which may differ in source terms $f_k$, boundary/initial condition terms $b_k$, or differential operator $\mathcal{F}_k$.
  • Figure 1: Schematic view of the learning framework and the proposed method. Three general types of learning are addressed: physics-informed learning, generative modeling, and few-shot learning. Physics-informed learning is performed with MH-PINNs; generative modeling is done afterwards by density estimation over the head via normalizing flows (NFs); finally, few-shot physics-informed learning is accomplished with the prior knowledge obtained from the previous two stages, via either fine-tuning with the learned regularization or Bayesian inference with the learned prior distribution. The body represents the set of basis functions learned from solving $\{\mathcal{T}_k\}_{k=1}^M$ with MH-PINNs. The density of the head, estimated from its samples using NFs, acts as the regularization, the prior distribution, or, together with the body, the generator, depending on how MH-PINNs are used in the application.
  • Figure 1: Results for approximating the stochastic function defined in Eq. \ref{eq:example_1} and solving the downstream few-shot regression tasks. (a) Left: $1,000$ samples generated from the exact distribution; middle: $1,000$ samples generated from the learned generator; right: statistics computed from samples, in which we refer to the interval of mean $\pm$ 2 standard deviations as the bound. (b)/(c) Results for the downstream tasks. Left: results for noiseless cases using our method, the transfer learning (TL) method in \cite{desai2021one}, and the regular NN method; middle: results for the noisy case using our method with Hamiltonian Monte Carlo (HMC) for posterior estimation; right: results for the same noisy case using our method with the Laplace approximation (LA) for posterior estimation.
  • Figure 1: The effect of different initialization methods for the head on basis-function learning, few-shot learning, and generator learning. (a) Samples of $20$ basis functions from MH-NNs trained to approximate $1,000$ samples of $f$ generated from Eq. \ref{eq:example_1}, using, from left to right, the RN ($0.05$), GU, and RN ($1$) initialization methods. (b) $1,000$ training samples of $f$. (c) Results for two downstream few-shot regression tasks, using the TL method without the regularization informed by the learned PDF, as opposed to the proposed approach. (d) Results for generator learning, using, from left to right, the RN ($0.05$), GU, and RN ($1$) initialization methods.
  • Figure 2: Results for regression on an out-of-distribution function. Left: few-shot regression with clean data using our approach; middle: few-shot regression with noisy data using our approach with HMC for posterior estimation; right: regression with sufficient clean data using the regular NN method and our approach with different values of the regularization weight $\alpha$ in Eq. \ref{eq:optimization}.
  • ...and 7 more figures
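
The second stage described in the captions above (estimate the density of the trained heads, then reuse it as a generator or as a regularizer for a new head) can be sketched as follows. This is a hedged illustration only: a multivariate Gaussian is swapped in for the normalizing flow to keep the example short, the `heads` array is random placeholder data rather than trained heads, and the penalty form "data misfit minus $\alpha$ times the log-density" is an assumed stand-in for the paper's optimization objective, which is not reproduced in this section. All function names are hypothetical.

```python
import numpy as np

def fit_gaussian(heads):
    """Fit a mean and covariance to head samples of shape (M, d).

    Stand-in for the normalizing-flow density estimator.
    """
    mu = heads.mean(axis=0)
    # Small jitter keeps the covariance invertible.
    cov = np.cov(heads, rowvar=False) + 1e-6 * np.eye(heads.shape[1])
    return mu, cov

def log_density(h, mu, cov):
    """Gaussian log-density of a single head vector h."""
    d = h.size
    diff = h - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def sample_heads(rng, mu, cov, n):
    """Generator use: draw n new heads from the fitted density."""
    return rng.multivariate_normal(mu, cov, size=n)

def few_shot_loss(h, phi, y, mu, cov, alpha):
    """Regularizer use: mean squared misfit minus alpha * log-density of h.

    phi is the frozen body evaluated at the few data points, shape (N, d).
    """
    resid = phi @ h - y
    return np.mean(resid**2) - alpha * log_density(h, mu, cov)

rng = np.random.default_rng(0)
heads = rng.normal(size=(200, 4))  # placeholder for M = 200 trained heads
mu, cov = fit_gaussian(heads)
new_heads = sample_heads(rng, mu, cov, 10)
```

Combining each sampled head with the frozen body yields a new function sample, while minimizing `few_shot_loss` over `h` pulls a downstream head toward the region where the earlier tasks' heads concentrate.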