PINNverse: Accurate parameter estimation in differential equations from noisy data with constrained physics-informed neural networks

Marius Almanstötter; Roman Vetter; Dagmar Iber

TL;DR

PINNverse is a training paradigm for physics-informed neural networks that addresses convergence issues, stability problems, overfitting, and complex loss function design by reformulating the learning process as a constrained differential optimization problem. It enables accurate parameter inference even when the forward problem is expensive to solve.

Abstract

Parameter estimation for differential equations from measured data is an inverse problem prevalent across quantitative sciences. Physics-Informed Neural Networks (PINNs) have emerged as effective tools for solving such problems, especially with sparse measurements and incomplete system information. However, PINNs face convergence issues, stability problems, overfitting, and complex loss function design. Here we introduce PINNverse, a training paradigm that addresses these limitations by reformulating the learning process as a constrained differential optimization problem. This approach achieves a dynamic balance between data loss and differential equation residual loss during training while preventing overfitting. PINNverse combines the advantages of PINNs with the Modified Differential Method of Multipliers to enable convergence to any point on the Pareto front. We demonstrate robust and accurate parameter estimation from noisy data in four classical ODE and PDE models from physics and biology. Our method enables accurate parameter inference even when the forward problem is expensive to solve.
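The idea behind the Modified Differential Method of Multipliers can be illustrated on a toy problem that is unrelated to the models in the paper. The sketch below assumes the commonly stated form of the method: gradient descent on the variables, gradient ascent on a Lagrange multiplier, and a quadratic damping term on the constraint violation. The objective `f`, the constraint `g`, the step size, and the damping value are hypothetical choices made for illustration; this is not the authors' implementation.

```python
import numpy as np

# Toy problem (an assumption for illustration, not from the paper):
# minimize f(x) = x0^2 + x1^2  subject to  g(x) = x0 + x1 - 1 = 0.
# The analytic solution is x = (0.5, 0.5) with multiplier lambda = -1.

def f(x):
    return float(np.sum(x ** 2))

def grad_f(x):
    return 2.0 * x

def g(x):
    return float(x[0] + x[1] - 1.0)

def grad_g(x):
    return np.array([1.0, 1.0])

def mdmm_solve(x0, lr=0.05, damping=1.0, steps=2000):
    """Explicit-Euler discretization of damped multiplier dynamics.

    x descends the damped Lagrangian f + lam * g + 0.5 * damping * g^2,
    while lam ascends it, so the constraint is enforced without a
    hand-tuned penalty weight.
    """
    x = np.asarray(x0, dtype=float).copy()
    lam = 0.0
    for _ in range(steps):
        violation = g(x)
        # Descent direction for x, including the quadratic damping term.
        x_step = grad_f(x) + (lam + damping * violation) * grad_g(x)
        x = x - lr * x_step
        # Ascent on the multiplier, driven by the constraint violation.
        lam = lam + lr * violation
    return x, lam

x_opt, lam_opt = mdmm_solve([2.0, -1.0])
print("x =", x_opt, "lambda =", lam_opt, "constraint =", g(x_opt))
```

In a PINN setting, the variables would be the network weights and the unknown equation parameters, and each constrained loss term would carry its own multiplier. That mapping is only sketched here in words.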

Paper Structure

This paper contains 16 sections, 31 equations, 5 figures.

Figures (5)

Figure 1: Schematic representation of the difference between PINN and PINNverse. When approximating solutions to differential equations (DEs) in residual form $\mathcal{F}=0$ with a neural network (NN), the architecture uses hidden layers (parameter set $\bm{\theta}$) to map input variables, including spatial coordinates $\bm{x}$ and time $t$, to the solution space of state variables, represented by a $k$-dimensional vector of functions $\bm{u}^{\bm{\theta}}$ (orange box). The training process of a PINN (purple box) minimizes a composite loss function. It combines a data term $L_{\textrm{data}}$, which penalizes deviations of the predicted solution $\bm{u}^{\bm{\theta}}$ from observed noisy data points $\bm{u}^\textrm{data}_i$ (blue dots), with terms that penalize violations of the differential equations ($L_\mathrm{de}$) and of the initial and/or boundary conditions ($L_{\textrm{ic}}$, $L_{\textrm{bc}}$). The predicted NN solution typically suffers from overfitting and misses non-convex parts (solid red line) of the Pareto front (solid black line). Examples of trajectories starting from initial points within the feasible region (dashed area) and leading to this front are schematically visualized for a 2D subspace. With PINNverse (green box), the data, IC and BC losses are formulated as external constraints under which the optimization is carried out, which avoids overfitting and allows the trajectories to converge also on non-convex parts of the front.
Figure 2: Parameter estimation performance in the kinetic reaction ODE model. a, Heatmaps depicting performance metrics across varying noise levels in the data, $\zeta$, and deviations in initial parameter guesses, $\xi$ (Methods). The black square highlights the scenario $\zeta=25\%$, $\xi=75\%$ analyzed in detail in subsequent panels. b, Comparison of trajectories for species $[C](t)$, generated using estimated parameters (green curve), true parameters (yellow curve), neural network predictions (blue curve), and the corresponding noisy observational data (brown dots). c, Training loss evolution for PINNverse and conventional PINN. Data, differential equation (DE) and initial condition (IC) losses are depicted. For PINNverse, a power law was fitted to the DE and IC losses after 1000 epochs (shifted dashed lines) with indicated exponents.
Figure 3: Parameter estimation performance in the FitzHugh--Nagumo ODE model. a, Heatmaps depicting performance metrics across varying noise levels in the data, $\zeta$, and deviations in initial parameter guesses, $\xi$ (Methods). The black square highlights the scenario $\zeta=25\%$, $\xi=500\%$ analyzed in detail in subsequent panels. b, Comparison of trajectories for the excitable variable $u(t)$, generated using estimated parameters (green curve), true parameters (yellow curve), neural network predictions (blue curve), and the corresponding noisy observational data (brown dots). c, Training loss evolution for PINNverse and conventional PINN. Data, differential equation (DE) and initial condition (IC) losses are depicted. For PINNverse, a power law was fitted to the DE and IC losses after 1000 epochs (shifted dashed lines) with indicated exponents.
Figure 4: Parameter estimation performance in the Fisher--KPP PDE model. a, Heatmaps depicting performance metrics across varying noise levels in the data, $\zeta$, and deviations in initial parameter guesses, $\xi$ (Methods). The black square highlights the scenario $\zeta=25\%$, $\xi=75\%$ analyzed in detail in subsequent panels. b, Comparison of trajectories for the cell concentration $u(x)$ at time point $t=2$, generated using estimated parameters (green curve), true parameters (yellow curve), neural network predictions (blue curve), and the corresponding noisy observational data (brown dots). c, Training loss evolution for PINNverse and conventional PINN. Data, differential equation (DE), initial condition (IC) and boundary condition (BC) losses are depicted. For PINNverse, a power law was fitted to the DE, IC and BC losses after 1000 epochs (shifted dashed lines) with indicated exponents.
Figure 5: Parameter estimation performance in Burgers' PDE model. a, Heatmaps depicting performance metrics across varying noise levels in the data, $\zeta$, and deviations in initial parameter guesses, $\xi$ (Methods). The black square highlights the scenario $\zeta=25\%$, $\xi=75\%$ analyzed in detail in subsequent panels. b, Comparison of trajectories for the dependent variable $u(x)$ at time point $t=0.5$, generated using estimated parameters (green curve), true parameters (yellow curve), neural network predictions (blue curve), and the corresponding noisy observational data (brown dots). c, Training loss evolution for PINNverse and conventional PINN. Data, differential equation (DE), initial condition (IC) and boundary condition (BC) losses are depicted. For PINNverse, a power law was fitted to the DE, IC and BC losses after 1000 epochs (shifted dashed lines) with indicated exponents.
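The captions for panels c mention power-law fits to the loss curves after 1000 epochs. The source does not say how these fits were computed. One plausible approach, sketched below as an assumption, is a least-squares line in log-log space, whose slope is the exponent. The function name `fit_power_law`, the synthetic loss curve, and the inclusive cutoff at epoch 1000 are all hypothetical and serve only to show the calculation.

```python
import numpy as np

def fit_power_law(epochs, losses, start_epoch=1000):
    """Fit loss ~ prefactor * epoch**exponent on epochs >= start_epoch.

    Assumes a straight-line least-squares fit in log-log space; the
    figures in the source may have used a different procedure.
    """
    epochs = np.asarray(epochs, dtype=float)
    losses = np.asarray(losses, dtype=float)
    # Keep the tail of the curve and drop non-positive values,
    # which have no logarithm.
    mask = (epochs >= start_epoch) & (losses > 0)
    slope, intercept = np.polyfit(np.log(epochs[mask]), np.log(losses[mask]), 1)
    return slope, np.exp(intercept)

# Synthetic loss curve (made-up numbers, not results from the paper):
# a power-law decay with multiplicative log-normal noise.
rng = np.random.default_rng(0)
epochs = np.arange(1, 20001).astype(float)
true_exponent, true_prefactor = -1.5, 50.0
noise = np.exp(0.05 * rng.standard_normal(epochs.size))
losses = true_prefactor * epochs ** true_exponent * noise

exponent, prefactor = fit_power_law(epochs, losses)
print(f"fitted exponent = {exponent:.3f}, prefactor = {prefactor:.2f}")
```

Fitting in log space weights all decades of the loss equally, which suits curves that span several orders of magnitude. A direct nonlinear fit would instead be dominated by the largest loss values.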