Kolmogorov n-Widths for Multitask Physics-Informed Machine Learning (PIML) Methods: Towards Robust Metrics

Michael Penwarden, Houman Owhadi, Robert M. Kirby

TL;DR

This work judiciously applies Kolmogorov n-widths as a measure of effectiveness of approximating functions, and incorporates this metric into the optimization process through regularization, which improves the models' generalizability over the multitask PDE problem.

Abstract

Physics-informed machine learning (PIML) as a means of solving partial differential equations (PDEs) has garnered much attention in the Computational Science and Engineering (CS&E) world. This topic encompasses a broad array of methods and models aimed at solving either a single PDE problem or a collection of PDE problems, the latter called multitask learning. PIML is characterized by the incorporation of physical laws into the training process of machine learning models in lieu of large datasets when solving PDE problems. Despite the overall success of this collection of methods, it remains incredibly difficult to analyze, benchmark, and generally compare one approach to another. Using Kolmogorov n-widths as a measure of the effectiveness of approximating functions, we judiciously apply this metric in the comparison of various multitask PIML architectures. We compute lower accuracy bounds and analyze the models' learned basis functions on various PDE problems. This is the first objective metric for comparing multitask PIML architectures and helps remove uncertainty in model validation from selective sampling and overfitting. We also identify avenues of improvement for model architectures, such as the choice of activation function, which can drastically affect model generalization to "worst-case" scenarios, an effect that is not observed when reporting task-specific errors. Finally, we incorporate this metric into the optimization process through regularization, which improves the models' generalizability over the multitask PDE problem.

Paper Structure (17 sections, 20 equations, 18 figures, 3 tables, 2 algorithms)

Figures (18)

  • Figure 1: Physics-Informed Machine Learning as a spectrum of data and physics. Residual points are where the PDE residual is evaluated and minimized during optimization.
  • Figure 2: (A) Diagram of full Physics-Informed Neural Network including the PDE residual formulation using automatic differentiation and optimization process. (B) PINN solution as a sum of basis functions ($\phi_i$) and coefficients ($c_i$).
  • Figure 3: (A) Multihead PINN where each "head" is a different linear combination ($c_i$) of the body network basis functions ($\phi_i$). (B) Physics-Informed DeepONet architecture where the "branch" network represents the coefficients ($c_i$) and the "trunk" network represents the basis functions ($\phi_i$).
  • Figure 4: (A) Distance of $\mathcal{M}_n$ to point $x \in \mathcal{M}$ where $y_n^*$ is the best approximation, i.e., satisfies Equation \ref{eq:distance}. (B) Distance of $\mathcal{M}_n$ to $\mathcal{A}$ where $y_n^*$ and $x^*$ achieve the sup inf of $\mathcal{M}_n$ to $\mathcal{A}$, i.e., satisfy $\sup_{x \in \mathcal{A}} \inf_{y_n \in \mathcal{M}_n} ||x-y_n||_{\mathcal{M}}$. (C) Distance of $\mathcal{M}^*_n$ to $\mathcal{A}$ where $y_n^*$ and $x^*$ achieve the inf sup inf of $\mathcal{M}^*_n$ to $\mathcal{A}$, i.e., satisfy $\inf_{\mathcal{M}_n \in \mathcal{M}} \sup_{x \in \mathcal{A}} \inf_{y_n \in \mathcal{M}_n} ||x-y_n||_{\mathcal{M}}$.
  • Figure 5: Multistep optimization process used to incorporate L-BFGS training into our Kolmogorov n-width regularization scheme. If L-BFGS training is not necessary, Algorithm \ref{alg:regularization} can be used as is, e.g., with the Adam optimizer only. Model optimization (step 3) is architecture dependent and explicitly defined in Equation \ref{eq:MHPINN} for MH-PINNs and Equation \ref{eq:PIDON} for PI-DON. The only difference here is the added Kolmogorov n-width regularization term: the first two steps estimate the "most challenging" problem in the solution manifold, which yields the term in Equation \ref{eq:model_optimization_with_regularization}.
  • ...and 13 more figures
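
The sup inf quantity in the Figure 4(B) caption can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the basis functions $\phi_i$ and a finite sample of the set $\mathcal{A}$ are available on a uniform grid, and it replaces the norm $||\cdot||_{\mathcal{M}}$ with a discrete root-mean-square error. The function name `worst_case_distance` and the toy sine family are hypothetical and chosen only for illustration.

```python
import numpy as np

def worst_case_distance(basis, targets):
    """Sampled sup-inf distance from span(basis) to a finite set of targets.

    basis:   (n_points, n) array; columns are basis functions phi_i on a grid
    targets: (n_points, m) array; columns are sampled functions x from the set A
    Returns the largest best-approximation error and the index of that target.
    """
    # Inner inf: least squares finds the best coefficients c_i for each target.
    coeffs, *_ = np.linalg.lstsq(basis, targets, rcond=None)
    residuals = targets - basis @ coeffs
    # Root-mean-square error per target (assumed stand-in for the function norm).
    errors = np.sqrt(np.mean(residuals**2, axis=0))
    # Outer sup: the hardest target in the sample.
    worst = int(np.argmax(errors))
    return float(errors[worst]), worst

# Toy example (assumption, not from the paper): targets sin(k*pi*t)/k for k = 1..5,
# approximated with a fixed 3-dimensional basis of the first three sines.
t = np.linspace(0.0, 1.0, 201)
targets = np.stack([np.sin(k * np.pi * t) / k for k in range(1, 6)], axis=1)
basis = np.stack([np.sin(k * np.pi * t) for k in range(1, 4)], axis=1)

dist, worst = worst_case_distance(basis, targets)
print(f"worst-case error {dist:.4f} at target index {worst}")
```

Because the Kolmogorov n-width additionally takes the infimum over all $n$-dimensional subspaces (Figure 4(C)), the value computed for any one fixed basis is only an upper bound on the n-width of the sampled set in this discrete norm.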

Theorems & Definitions (6)

  • Definition 1
  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Remark 5