Sobolev Training for Physics Informed Neural Networks

Hwijae Son, Jin Woo Jang, Woo Jin Han, Hyung Ju Hwang

TL;DR

The paper introduces Sobolev-PINNs, a Sobolev-norm based loss framework for Physics Informed Neural Networks to accelerate PDE solving. By incorporating derivative information, it provides convergence guarantees in Sobolev spaces ($L^2$, $H^1$, $H^2$) for the heat, Burgers, Fokker–Planck, and high-dimensional Poisson equations and demonstrates substantial empirical speedups over traditional $L^2$ losses. Theoretical results are complemented by extensive experiments showing faster convergence, reduced epoch counts, and robustness across learning rates, especially with iterative sampling in high dimensions. This approach offers a principled route to more efficient and accurate neural solvers for a broad class of PDEs.

Abstract

Physics Informed Neural Networks (PINNs) are a promising application of deep learning. The smooth architecture of a fully connected neural network is well suited to finding solutions of PDEs; the corresponding loss function can be designed intuitively and guarantees convergence for various kinds of PDEs. However, the rate of convergence has been considered a weakness of this approach. This paper proposes Sobolev-PINNs, a novel loss function for training PINNs that makes training substantially more efficient. Inspired by recent studies that incorporate derivative information into the training of neural networks, we develop a loss function that guides a neural network to reduce the error in the corresponding Sobolev space. Surprisingly, a simple modification of the loss function can make the training process similar to Sobolev Training, although training PINNs is not a fully supervised learning task. We provide several theoretical justifications that the proposed loss functions upper bound the error in the corresponding Sobolev spaces for the viscous Burgers equation and the kinetic Fokker–Planck equation. We also present several simulation results showing that, compared with the traditional $L^2$ loss function, the proposed loss function guides the neural network to significantly faster convergence. Moreover, we provide empirical evidence that the proposed loss function, together with iterative sampling techniques, performs better in solving high-dimensional PDEs.
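
To make the idea of a Sobolev-type loss concrete, here is a minimal sketch for the 1-D heat equation $u_t = u_{xx}$. It rests on several assumptions that do not come from the paper. The trial function `make_candidate` is a hypothetical closed-form stand-in for a neural network. Derivatives are taken by central finite differences rather than the automatic differentiation a real PINN would use. The "order 1" loss simply adds the squared spatial derivative of the PDE residual to the usual squared residual. The paper's exact loss, including its initial and boundary terms and any weighting, is not reproduced here.

```python
import math

def make_candidate(decay):
    """Hypothetical trial solution u(x, t) = exp(-decay * t) * sin(x).
    decay = 1 solves u_t = u_xx exactly; other values do not."""
    def u(x, t):
        return math.exp(-decay * t) * math.sin(x)
    return u

def heat_residual(u, x, t, h=1e-3):
    """Residual r = u_t - u_xx, approximated with central finite differences
    (an assumption of this sketch; PINNs normally use autodiff)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - u_xx

def residual_loss(u, points, order=0, h=1e-3):
    """Mean over sample points of r^2 (order 0, L2-type),
    plus (dr/dx)^2 when order >= 1 (H1-type in x)."""
    total = 0.0
    for x, t in points:
        r = heat_residual(u, x, t, h)
        total += r * r
        if order >= 1:
            r_x = (heat_residual(u, x + h, t, h)
                   - heat_residual(u, x - h, t, h)) / (2.0 * h)
            total += r_x * r_x
    return total / len(points)

# Interior collocation points on (0, pi) x [0, 1]; the grid is illustrative.
points = [(math.pi * i / 10.0, 0.2 * j) for i in range(1, 10) for j in range(6)]

exact = make_candidate(1.0)   # satisfies the heat equation
wrong = make_candidate(2.0)   # violates it

l2_exact = residual_loss(exact, points, order=0)
h1_exact = residual_loss(exact, points, order=1)
l2_wrong = residual_loss(wrong, points, order=0)
h1_wrong = residual_loss(wrong, points, order=1)

print("exact solution: L2-type", l2_exact, "H1-type", h1_exact)
print("wrong solution: L2-type", l2_wrong, "H1-type", h1_wrong)
```

Both losses are close to zero for the exact solution. For the incorrect candidate, the H1-type loss is larger than the L2-type loss because it also penalizes the spatial variation of the residual. That extra derivative information is what a Sobolev-type loss supplies during training.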

Paper Structure

This paper contains 20 sections, 9 theorems, 88 equations, 7 figures, 1 table.

Key Result

Theorem 4.1

For the following 1-D heat and Burgers' equations: there hold, provided that $u_{nn}$ is smooth,

Figures (7)

  • Figure 1: First row: results for $\sin(x)$. Second row: results for $\mathrm{ReLU}(x)$. First column: histograms generated from repeated training of neural networks to fit $\sin(x)$ and $\mathrm{ReLU}(x)$. Second column: test $L^2$ errors. Third column: average training time for each loss function to reach the error threshold. Error bars show standard deviations. The error threshold is set to $10^{-4}$.
  • Figure 2: With the $L^2$ loss, the average number of epochs needed to bring the error below $10^{-3}$ increases as $k$ increases. With the $H^1$ and $H^2$ losses, the required number of epochs increases much more slowly or stays the same as $k$ increases.
  • Figure 3: First row: results for the heat equation. Second row: results for Burgers' equation. First column: histograms for the heat and Burgers' equations generated from a hundred neural networks for each loss function. Second column: test $L^{\infty}(0,T;L^2(\Omega))$ errors. Third column: average training time for each loss function to reach the error threshold. Error bars show standard deviations. The error threshold is set to $10^{-5}$.
  • Figure 4: First row: results for the $f_1$ initial condition. Second row: results for the $f_2$ initial condition. First column: histograms generated from a hundred neural networks for each loss function. Second column: test $L^{\infty}(0,T;L^2(\Omega))$ errors. Third column: average training time for each loss function to reach the error threshold. Error bars show standard deviations. The error thresholds for the initial conditions $f_1(x,v)$ and $f_2(x,v)$ are set to $10^{-4}$ and $10^{-3}$, respectively.
  • Figure 5: Left column: test errors over the course of training for different values of $k$. Right column: training time required to reach a given test error.
  • ...and 2 more figures

Theorems & Definitions (21)

  • Remark 3.1
  • Theorem 4.1
  • Proof 1
  • Theorem 4.2
  • Proof 2
  • Remark 4.3
  • Remark 4.4
  • Remark 4.5
  • Theorem 7.1: Theorem 7.1.5 in evans10
  • Remark 7.2
  • ...and 11 more