
Bayesian inference for ordinary differential equations models with heteroscedastic measurement error

Selva Salimi, David J. Warne, Christopher Drovandi

Abstract

Ordinary differential equation (ODE) models are widely used to describe systems in many areas of science. To ensure these models provide accurate and interpretable representations of real-world dynamics, it is often necessary to infer parameters from data, which involves specifying the form of the ODE system as well as a statistical model describing the observational process. A popular and convenient choice for the error model is a Gaussian distribution with constant variance. However, this choice may not be realistic in many systems, since the variance of the observational error may vary over time or depend on the system state (heteroscedasticity), reflecting changes in measurement conditions, environmental fluctuations, or intrinsic system variability. Misspecification of the error model can lead to substantial inaccuracies in the posterior estimates of the ODE model parameters and in predictions. More elaborate parametric error models could be specified, but this would increase computational cost because additional parameters would need to be estimated within the Markov chain Monte Carlo (MCMC) procedure, and such models may still be misspecified. In this work we propose a two-step semi-parametric framework for Bayesian estimation of ODE model parameters when the error process is heteroscedastic. The first step applies a heteroscedastic Gaussian process to estimate the time-dependent error, and the second step performs Bayesian inference for the ODE model parameters using the time-dependent error estimated in step one in the likelihood function. Through a simulation study and two real-world applications, we demonstrate that the proposed approach yields more reliable posterior inference and predictive uncertainty than the standard homoscedastic model. Although our focus is on heteroscedasticity, the framework could be applied to handle more complex error processes.
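
As a rough illustration of step two, the sketch below evaluates a Gaussian log-likelihood in which each observation has its own standard deviation, as would be supplied by step one. It is not the authors' implementation: the logistic growth ODE, its closed-form solution, the function names, the synthetic observations, and the hard-coded standard deviations are all assumptions chosen to keep the example self-contained.

```python
import math


def logistic_solution(t, r, K, y0):
    """Closed-form solution of dy/dt = r*y*(1 - y/K) with y(0) = y0 (assumed example ODE)."""
    return K / (1.0 + (K / y0 - 1.0) * math.exp(-r * t))


def gaussian_loglik(times, obs, sigmas, r, K, y0):
    """Sum of independent Gaussian log-densities, one standard deviation per observation."""
    total = 0.0
    for t, y, s in zip(times, obs, sigmas):
        mu = logistic_solution(t, r, K, y0)
        total += -0.5 * math.log(2.0 * math.pi * s * s) - 0.5 * ((y - mu) / s) ** 2
    return total


# Synthetic, illustrative data: model values plus fixed offsets that grow over time.
times = [0.0, 1.0, 2.0, 3.0]
offsets = [0.3, -0.8, 1.5, -3.0]
obs = [logistic_solution(t, 0.8, 100.0, 5.0) + d for t, d in zip(times, offsets)]

# Hypothetical step-one output: an estimated noise standard deviation at each time point.
sigma_hat = [0.5, 1.0, 2.0, 4.0]

# Homoscedastic alternative: a single constant standard deviation for every observation.
sigma_const = [2.0] * len(times)

loglik_het = gaussian_loglik(times, obs, sigma_hat, 0.8, 100.0, 5.0)
loglik_hom = gaussian_loglik(times, obs, sigma_const, 0.8, 100.0, 5.0)
print(loglik_het, loglik_hom)
```

Because the standard deviations are fixed inputs rather than unknowns, a sampler in step two would only need to explore the ODE parameters (here `r`, `K`, and `y0`), which is the computational saving the abstract describes relative to a fully parametric error model.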

Paper Structure (13 sections, 41 equations, 10 figures, 3 algorithms)


Figures (10)

  • Figure 1: (a) Sparse observations: With few data points, the posterior distributions under homoscedastic and heteroscedastic error assumptions are similar, as limited data provide little information to detect time-varying variance. (b) Dense observations: With more data points, heteroscedastic models capture time-varying error, revealing that homoscedastic assumptions can lead to overconfident or underconfident parameter estimates, particularly in regions of high or low measurement variability. Heteroscedastic models therefore provide more accurate uncertainty quantification across the input space.
  • Figure 2: MMD comparison between the posterior samples obtained from the homoscedastic model and our approach relative to the true posterior. For each dataset size ($n = 10, 20, 50, 100, 500, 1000$), 25 datasets under the same fixed parameter values were generated by sampling the ODE solution at equally spaced time points. This produces 25 MMD values for each combination of method and dataset size, and these 25 MMD values are represented as a boxplot. Lower MMD values indicate closer alignment with the true posterior. Note that for visualization clarity, boxplots for our approach are slightly offset to the right and those for the homoscedastic model slightly offset to the left of each dataset size shown on the x-axis.
  • Figure 3: Comparison of the estimated error over time, showing the HetGP fit (blue), the homoscedastic model (green), and the true error level used in the dataset (red points).
  • Figure 4: Kernel Density Estimation (KDE) of the marginal posterior distributions for parameters $\alpha$ (growth rate), $\gamma$ (shape parameter), and $K$ (carrying capacity) under different error models. The blue curves correspond to our approach and the green curves to the homoscedastic model.
  • Figure 5: Bivariate posterior distributions for the parameter pairs $(\alpha, \gamma)$, $(\alpha, K)$, and $(\gamma, K)$, comparing our approach (blue) with the homoscedastic model (green).
  • ...and 5 more figures
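
The MMD comparison described in the Figure 2 caption can be illustrated with a small sketch. The code below computes a biased (V-statistic) estimate of the squared maximum mean discrepancy between two sets of samples using a Gaussian kernel. The kernel choice, the fixed bandwidth, the function names, and the toy samples are assumptions; the captions do not state which kernel or estimator the authors used.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: mean k(x,x') + mean k(y,y') - 2 * mean k(x,y)."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy two-dimensional samples standing in for two sets of posterior draws.
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
nearby = [(0.1, 0.0), (1.0, 0.1), (0.0, 0.9)]
distant = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]

print(mmd_squared(reference, nearby), mmd_squared(reference, distant))
```

Lower values indicate that the two sample sets are closer in distribution, which is how the boxplots in Figure 2 are read. In practice the bandwidth would need to be chosen with care, for example by a median-distance heuristic, since it controls the scale at which differences are detected.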