A practical identifiability criterion leveraging weak-form parameter estimation

Nora Heitzman-Breen, Vanja Dukic, David M. Bortz

TL;DR

This work addresses practical identifiability in dynamical systems by introducing $(e,q)$-identifiability, which ties the observation noise level $e$ to the mean-squared error of the estimate of a parameter $w$ via $\mathrm{MSE} < (q\,w)^2$. It combines weak-form parameter estimation (WENDy) with differential elimination to generate weak-form input-output equations for systems with unobserved variables, enabling rapid, noise-robust identifiability analysis. The approach is demonstrated on two canonical biological models (blood-tissue diffusion and SIR), showing that (i) WENDy achieves substantial computational speedups over traditional output-error methods while maintaining or improving estimator accuracy under noise, and (ii) the $(e,q)$-criterion captures identifiability under both additive and multiplicative noise, in agreement with conventional confidence-interval coverage and relative-error metrics. Overall, the method provides a practical, scalable framework for a priori identifiability assessment that is robust to measurement noise and applicable to a range of systems, with potential extensions to PDEs and non-Gaussian noise.

Abstract

In this work, we define a practical identifiability criterion, $(e,q)$-identifiability, based on a parameter $e$, reflecting the noise in observed variables, and a parameter $q$, reflecting the mean-square error of the parameter estimator. This criterion better captures changes in the quality of the parameter estimate due to increased noise in the data than existing criteria based solely on average relative errors. Furthermore, we leverage a weak-form, equation-error-based method of parameter estimation for systems with unobserved variables to assess practical identifiability far more quickly than is possible with output-error-based parameter estimation. We do so by generating weak-form input-output equations using differential algebra techniques, as previously proposed by Boulier et al. [1], and then applying Weak-form Estimation of Nonlinear Dynamics (WENDy) to obtain parameter estimates. This method is computationally efficient and robust to noise, as demonstrated through two classical biological modelling examples.
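As a rough illustration of how the criterion might be checked numerically, the sketch below tests the inequality $\mathrm{MSE} < (q\,w)^2$ for every parameter over a set of repeated estimates. It is a minimal sketch, not the authors' implementation: the function names `mse_per_parameter`, `is_eq_identifiable`, and `toy_estimates` are hypothetical, the toy least-squares estimator stands in for WENDy, and scaling the noise standard deviation as $e$ times the root-mean-square of the signal is an assumption made for the example rather than the paper's definition of the error ratio.

```python
import random


def mse_per_parameter(estimates, w_true):
    """Empirical mean-squared error of each parameter over repeated trials.

    estimates: list of trials, each a list with one estimate per parameter.
    w_true:    list of true parameter values.
    """
    n_trials = len(estimates)
    return [
        sum((trial[j] - w_true[j]) ** 2 for trial in estimates) / n_trials
        for j in range(len(w_true))
    ]


def is_eq_identifiable(estimates, w_true, q):
    """True if every parameter satisfies MSE < (q * w)^2."""
    mse = mse_per_parameter(estimates, w_true)
    return all(m < (q * w) ** 2 for m, w in zip(mse, w_true))


def toy_estimates(w_true, e, n_trials=1000, n_obs=50, seed=0):
    """Hypothetical stand-in estimator (NOT WENDy).

    Fits the slope of y = w * t by least squares from noisy observations.
    Assumption: the noise standard deviation is e times the RMS of the signal.
    """
    rng = random.Random(seed)
    t = [i / (n_obs - 1) for i in range(n_obs)]
    signal = [w_true * ti for ti in t]
    rms = (sum(s * s for s in signal) / n_obs) ** 0.5
    sigma = e * rms
    tt = sum(ti * ti for ti in t)
    estimates = []
    for _ in range(n_trials):
        y = [s + rng.gauss(0.0, sigma) for s in signal]
        w_hat = sum(ti * yi for ti, yi in zip(t, y)) / tt
        estimates.append([w_hat])
    return estimates


# Check the criterion at a 5% noise level with q = 10% for the toy model.
estimates = toy_estimates(w_true=2.0, e=0.05)
print(is_eq_identifiable(estimates, [2.0], q=0.10))
```

Sweeping `e` and `q` over a grid and recording the Boolean result at each point would give a region plot of the kind described in the captions of Figures 4 and 5, under the same assumptions.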

Paper Structure

This paper contains 21 sections, 41 equations, 16 figures, 3 tables.

Figures (16)

  • Figure 1: Using WENDy with equation \ref{eq:weak_bld_cnc}, we recover parameters for the blood diffusion model \ref{eq:bld_cnc} with only observations in the blood compartment ($x_1(t)$). The drug concentration dynamics in the blood compartment (red) and the tissue compartment (blue) are given for model \ref{eq:bld_cnc} with parameters $k_{12}=5$, $k_{21}=1$, and $V_e=6$. These parameters are chosen to allow comparison with the example in Boulier, Korporal, Lemaire, et al. (2014). The plot depicts the relative error in estimating the true parameters using WENDy from 400 observations of the blood compartment (black dots) with an $e=5\%$ additive observation error ratio. Note that an $e=5\%$ additive observation error ratio is equivalent to white noise of $\sigma=25.06\%$ for this choice of model parameters.
  • Figure 2: Relative parameter error versus observation error ratio $e$ when WENDy is used to estimate the parameters in \ref{eq:weak_bld_cnc}. The different curves correspond to different test functions ($C^\infty$ from \ref{eq:C_inf}, Hartley from \ref{eq:Hartley}, and polynomial from \ref{eq:poly}). When using either $C^\infty$ or polynomial test functions (as is standard in WENDy), the relative error remains below $25\%$ for observations with up to a $10\%$ additive error ratio, and for polynomial test functions, the relative error remains below $50\%$ for observations with up to a $24.5\%$ additive error ratio. Note that each value on these curves represents the average relative error over 1,000 simulated datasets with additive Gaussian error ratio $e\in[0\%,24.5\%]$. Also note that this range of error ratios is equivalent to additive white noise at levels $\sigma\in[0,0.8]$.
  • Figure 3: Using WENDy with equations \ref{eq:wf_sir_conpop}, we recover the transmission rate $\beta$ for the SIR model \ref{eq:SIR} with only observations in the infected compartment, even with large observational noise. The infected compartment dynamics are given for model \ref{eq:SIR} with $N=10{,}000$, $S_0=N-1$, $I_0=1$, $R_0=0$, $\beta=\frac{5.5}{N}$, $\gamma=5$. These parameter values correspond to an infection with a 5-day infectious period and a basic reproduction number $\mathcal{R}_0=1.1$, which could reasonably represent a seasonal influenza outbreak (Biggerstaff, Cauchemez, Reed, et al., 2014). We estimate the true value of $\beta$ using WENDy from 31 observations of the infected compartment (black dots) and recover the original dynamics (red line) from observations with an $e=20\%$ additive observation error ratio (right).
  • Figure 4: Using WENDy with equation \ref{eq:weak_bld_cnc}, we find the blood concentration model to be generally practically identifiable below $11\%$ scaled noise in the data. (A.) The area in blue denotes where the model is $(e,q)$-identifiable, and the area in white denotes where the model is not $(e,q)$-identifiable. The $(e,q)$-identifiability was determined from 1,000 simulations at each respective error level, $e$. The black $\pmb{\mathsf{X}}$ denotes the $(5,50)$-identifiability cutoff. Because this point falls within the blue region, we can say that at a $5\%$ additive observation error ratio in the data, the MSEs of $\widehat{w}_1$, $\widehat{w}_2$, and $\widehat{w}_3$ remain below the square of $50\%$ of the parameter magnitude. The red star ⁎ marks the $(10,20)$-identifiability criterion. (B.) The relative error determined from 1,000 simulations at each respective error level, $e$. (C.) The proportion of $95\%$ confidence intervals at $e\in[0\%,20\%]$ that contain the true values of $w_1$, $w_2$, and $w_3$.
  • Figure 5: Using WENDy with equations \ref{eq:wf_sir_conpop}, we find the SIR model to be generally practically identifiable for additive noise in the range $[0\%,200\%]$. (A.) The area in blue denotes where the model is $(e,q)$-identifiable, and the area in white denotes where the model is not $(e,q)$-identifiable. The $(e,q)$-identifiability was determined from 1,000 simulations at each respective error level, $e$. The black $\pmb{\mathsf{X}}$ denotes the $(50,10)$-identifiability cutoff. Because this point falls within the blue region, we can say that at a $50\%$ additive error ratio in the data, the MSE of $\widehat{\beta}$ remains below the square of $10\%$ of the magnitude of $\beta$. The red star ⁎ marks the $(10,20)$-identifiability criterion. (B.) The relative error determined from 1,000 simulations at each respective error level, $e$. (C.) The proportion of estimated $95\%$ confidence intervals at $e\in[0\%,200\%]$ that contain the true value of $\beta$.
  • ...and 11 more figures
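
The basic reproduction number quoted in the Figure 3 caption can be checked from the listed parameter values. The short calculation below assumes mass-action incidence (an infection term of the form $\beta S I$, which is consistent with the scaling $\beta = 5.5/N$) and an initially almost fully susceptible population, $S_0 \approx N$; the model equations themselves are not reproduced in this summary, so this is a consistency check under those assumptions rather than a restatement of the paper's derivation.

$$
\mathcal{R}_0 \;=\; \frac{\beta N}{\gamma} \;=\; \frac{(5.5/N)\,N}{5} \;=\; \frac{5.5}{5} \;=\; 1.1
$$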

Theorems & Definitions (10)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Example 3.1
  • Example 3.2
  • Example 4.1
  • Example 4.2