Optimal time sampling in physics-informed neural networks

Gabriel Turinici

TL;DR

This work analyzes how temporal sampling in physics-informed neural networks (PINNs) should be weighted to minimize the final-time error under a finite computational budget. By examining a linear ODE surrogate and leveraging convergence properties, it proves that the optimal time-weighting is a truncated exponential distribution with rate $4\lambda/3$, where $\lambda$ is a Lyapunov-like exponent, and shows the optimal instantaneous error profile $w(t)\propto e^{-\lambda (T-t)/3}$. Numerical experiments on a linear ODE, Burgers' equation, and the Lorenz system validate the theory: chaotic or highly sensitive regimes favor heavier weighting of early times, while stable or parabolic cases benefit less from such weighting. The results provide a principled basis for time-sampling strategies in PINNs and highlight the practical need to estimate the appropriate rate parameter, potentially via hyperparameter optimization or adaptive schemes. Overall, the paper connects dynamical-system concepts to PINN training efficiency and offers actionable guidance for improving accuracy under limited compute.

Abstract

Physics-informed neural networks (PINNs) are an extremely powerful paradigm for solving equations encountered in scientific computing applications. An important part of the procedure is the minimization of the equation residual, which includes, when the equation is time-dependent, a sampling in time. It has been argued in the literature that the sampling need not be uniform but should overweight initial time instants, but no rigorous explanation was provided for this choice. In the present work we take some prototypical examples and, under standard hypotheses concerning the neural network convergence, we show that the optimal time sampling follows a (truncated) exponential distribution. In particular, we explain when it is best to use uniform time sampling and when one should not. The findings are illustrated with numerical examples on a linear equation, Burgers' equation and the Lorenz system.
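
As a minimal sketch of how collocation times might be drawn from a truncated exponential law on $[0,T]$, the snippet below uses inverse-CDF sampling. It assumes the density is proportional to $e^{-rt}$ on $[0,T]$, so that $r=0$ recovers the uniform law and $r>0$ overweights early times. This parametrization, the function name `sample_truncated_exponential`, and the example values of $\lambda$ and $T$ are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np


def sample_truncated_exponential(n, T, r, rng=None):
    """Draw n times in [0, T] with density proportional to exp(-r * t).

    Hypothetical helper: the parametrization is an assumption made for
    illustration (r = 0 gives the uniform law on [0, T]).
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(0.0, 1.0, size=n)
    if abs(r * T) < 1e-12:
        # Degenerate case: the law reduces to the uniform law on [0, T].
        return u * T
    # Inverse of the CDF F(t) = (1 - exp(-r t)) / (1 - exp(-r T)).
    # expm1/log1p keep the formula accurate when |r * T| is small.
    return -np.log1p(u * np.expm1(-r * T)) / r


# Example usage with assumed values: rate 4 * lambda / 3 as quoted in the TL;DR.
lam = 2.0          # assumed Lyapunov-like exponent
T = 1.0            # assumed final time
times = sample_truncated_exponential(1000, T, 4.0 * lam / 3.0,
                                     rng=np.random.default_rng(0))
print(times.min(), times.max(), times.mean())
```

The sampled times can then be used as the temporal collocation points when evaluating the equation residual. A negative `r` is also accepted and shifts the weight toward late times.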

Paper Structure

This paper contains 17 sections, 3 theorems, 22 equations, and 7 figures.

Key Result

Proposition

Denote and assume that eq:error_definition holds true. Then under hypothesis (H Opt) the error $|\mathcal{U}_\theta(T)- u(T)|$ is at least equal to with equality when $w(t)$ is proportional to $e^{-\lambda(T-t)/3}$.

Figures (7)

  • Figure 1: An illustration of the network $\mathcal{U}_\theta$. It takes as input a time $t$ and a space value $x$ and outputs the solution candidate $\mathcal{U}_\theta(t,x)$ for this input pair. The NN is trained so that $\mathcal{U}_\theta(t,x)$ is close to the solution $u$ of \ref{eq:general_equation}.
  • Figure 2: Test of the model expressiveness (results for $\lambda=2$). The model solution is graphically indistinguishable from the exact solution, meaning that the NN is complex enough to reproduce the shape of the solution.
  • Figure 3: Results for $\lambda=2$ and sampling parameter $r=2.0$. First two plots: results for $500$ epochs. Last two plots: results for $1500$ epochs.
  • Figure 4: Influence of the sampling law for $\lambda=2$ (from left to right, top to bottom, plots 1 and 2), $\lambda=-2$ (plots 3 and 4) and $\lambda=0$ (plots 5 and 6). All samples are drawn from the law $\mathcal{E}^{0,T,r}$. Plots 1, 3 and 5: the final error as a function of $r$. Plots 2, 4 and 6: the loss.
  • Figure 5: Burgers' equation with sampling parameter $r=0$, i.e., the uniform law $\mathcal{E}^{0,T,0}$. Left plot: the solution at different times. Right plot: the comparison with a finite difference solution considered exact.
  • ...and 2 more figures

Theorems & Definitions (5)

  • Proposition
  • Remark
  • Proposition
  • Remark
  • Proposition