Efficient sequential Bayesian inference for state-space epidemic models using ensemble data assimilation

Dhorasso Temfack, Jason Wyse

Abstract

Estimating latent epidemic states and model parameters from partially observed, noisy data remains a major challenge in infectious disease modeling. State-space formulations provide a coherent probabilistic framework for such inference, yet fully Bayesian estimation is often computationally prohibitive because evaluating the observed-data likelihood requires integration over a latent trajectory. The Sequential Monte Carlo squared (SMC$^2$) algorithm offers a principled approach for joint state and parameter inference, combining an outer SMC sampler over parameters with an inner particle filter that estimates the likelihood up to the current time point. Despite its theoretical appeal, this nested particle filter imposes substantial computational cost, limiting routine use in near-real-time outbreak response. We propose Ensemble SMC$^2$ (eSMC$^2$), a computationally efficient variant that replaces the inner particle filter with an Ensemble Kalman Filter (EnKF) to approximate the incremental likelihood at each observation time. While this substitution introduces bias via a Gaussian approximation, we mitigate finite-sample effects using an unbiased Gaussian density estimator and adapt the EnKF for epidemic data through state-dependent observation variance. This makes our approach particularly suitable for overdispersed incidence data commonly encountered in infectious disease surveillance. Simulation experiments with known ground truth and an application to 2022 United States (U.S.) monkeypox incidence data demonstrate that eSMC$^2$ achieves substantial computational gains while producing posterior estimates comparable to SMC$^2$. The method accurately recovers latent epidemic trajectories and key epidemiological parameters, providing an efficient framework for sequential Bayesian inference from imperfect surveillance data.
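
The core substitution described above, using an EnKF update to approximate the incremental likelihood at one observation time, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the observation is a scalar, the function names `enkf_step` and `obs_var_fn` are hypothetical, the variance form `m + m**2 / 10` in the usage lines is an assumed overdispersed example, and the likelihood uses a plain plug-in Gaussian density rather than the unbiased Gaussian density estimator mentioned in the abstract.

```python
import numpy as np

def enkf_step(ensemble, y, obs_index, obs_var_fn, rng):
    """One stochastic EnKF update for a scalar observation y.

    ensemble   : (n_members, n_states) forecast ensemble
    obs_index  : column of the state that is observed (e.g. incidence)
    obs_var_fn : hypothetical map from predicted mean to observation variance
    Returns the updated ensemble and a plug-in Gaussian log-likelihood increment.
    """
    n = ensemble.shape[0]
    hx = ensemble[:, obs_index]            # predicted observations per member
    mu = hx.mean()
    r = obs_var_fn(mu)                     # state-dependent observation variance
    s = hx.var(ddof=1) + r                 # innovation variance

    # Cross-covariance between every state component and the predicted observation
    anomalies = ensemble - ensemble.mean(axis=0)
    cov_xy = anomalies.T @ (hx - mu) / (n - 1)
    gain = cov_xy / s                      # Kalman gain, shape (n_states,)

    # Perturbed-observation update of each ensemble member
    y_pert = y + rng.normal(0.0, np.sqrt(r), size=n)
    updated = ensemble + np.outer(y_pert - hx, gain)

    # Gaussian approximation to log p(y_t | y_{1:t-1}, theta)
    loglik = -0.5 * (np.log(2.0 * np.pi * s) + (y - mu) ** 2 / s)
    return updated, loglik

# Usage with an assumed two-component state (incidence, transmission rate)
rng = np.random.default_rng(0)
forecast = rng.normal([100.0, 0.3], [10.0, 0.05], size=(200, 2))
updated, loglik = enkf_step(forecast, 120.0, 0, lambda m: m + m**2 / 10.0, rng)
```

In an SMC$^2$-style sampler, one such log-likelihood increment per parameter particle would take the place of the particle-filter estimate when the parameter weights are updated.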

Paper Structure

This paper contains 17 sections, 32 equations, 26 figures, 4 tables, 5 algorithms.

Figures (26)

  • Figure 1: Schematic illustration of the sequential update–prediction cycle in the EnKF. Each time step alternates between an update stage, where observations $y_t$ are assimilated to refine the latent state estimate, and a prediction stage, where the ensemble is propagated forward through the system dynamics.
  • Figure 2: Flowchart of the eSMC$^2$ algorithm. Each parameter particle carries an ensemble of state particles, which are propagated using the EnKF. Weights are updated based on an EnKF-based likelihood approximation. When the ESS falls below a threshold, parameter particles are resampled and rejuvenated via a PMMH step. This procedure is repeated for every data point, allowing sequential Bayesian learning.
  • Figure 3: Example 1: Filtered estimates of simulated incidence, transmission rate, and effective reproduction number. Solid lines show the posterior mean, with shaded areas representing the 95% credible intervals. Black dots indicate the observed incidence.
  • Figure 4: Example 2: Filtered estimates of simulated incidence, transmission rate, and effective reproduction number. Solid lines show posterior means; shaded areas indicate 95% credible intervals. Observed incidence is shown as black dots.
  • Figure 5: Example 3: Filtered estimates of simulated incidence, transmission rate, and effective reproduction number. Solid lines show posterior means; shaded areas indicate 95% credible intervals. Observed incidence is shown as black dots.
  • ...and 21 more figures
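
The weight update and ESS-triggered resampling described in the Figure 2 caption can be sketched as below. This is a hedged illustration only: the function names, the 0.5 threshold fraction, and the multinomial resampling scheme are assumptions, and the PMMH rejuvenation step that follows resampling is omitted.

```python
import numpy as np

def effective_sample_size(log_w):
    """ESS of a set of unnormalised log-weights."""
    w = np.exp(log_w - log_w.max())        # subtract the max for numerical stability
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

def reweight_and_maybe_resample(log_w, loglik_increments, rng, frac=0.5):
    """Add likelihood increments to the log-weights; resample if the ESS is low.

    frac is an assumed threshold fraction of the number of parameter particles.
    Returns (new_log_weights, ancestor_indices, resampled_flag).
    """
    log_w = log_w + loglik_increments
    n = log_w.size
    if effective_sample_size(log_w) < frac * n:
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        idx = rng.choice(n, size=n, p=w)   # multinomial resampling (assumed scheme)
        return np.zeros(n), idx, True      # weights reset to uniform after resampling
    return log_w, np.arange(n), False
```

After a resampling event, each surviving parameter particle would then be moved by the rejuvenation kernel, with the resampled index array indicating which state ensembles to carry forward.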