Iterating marginalized Bayes maps for likelihood maximization with application to nonlinear panel models

Jesse Wheeler, Aaron J. Abkemeier, Edward L. Ionides

TL;DR

The paper tackles likelihood-based inference for high-dimensional nonlinear panel models by introducing the marginalized panel iterated filter (MPIF), which marginalizes unit-specific parameters during unit updates to curb particle depletion. MPIF preserves the iterated filtering framework while reducing the number of resampled parameters, achieving scalability with complexity on the order of $O(J M N U)$. The authors establish theoretical results in the Gaussian special case showing convergence to the MLE, and they demonstrate through simulations and a UK measles data study that MPIF yields higher likelihoods and lower Monte Carlo variability than the existing panel iterated filter (PIF). The approach broadens the practical ability to fit complex mechanistic nonlinear models to panel data, with implications for epidemiology, ecology, and other domains employing longitudinal panel observations.

Abstract

Complex dynamic systems can be investigated by fitting mechanistic stochastic dynamic models to time series data. In this context, commonly used Monte Carlo inference procedures for model selection and parameter estimation quickly become computationally unfeasible as the system dimension grows. The increasing prevalence of panel data, characterized by multiple related time series, therefore necessitates the development of inference algorithms that are effective for this class of high-dimensional mechanistic models. Nonlinear, non-Gaussian mechanistic models are routinely fitted to time series data but seldom to panel data, despite its widespread availability, suggesting that the practical difficulties for existing procedures are prohibitive. We investigate the use of iterated filtering algorithms for this purpose. We introduce a novel algorithm that contains a marginalization step that mitigates issues arising from particle filtering in high dimensions. Our approach enables likelihood-based inference for models that were previously considered intractable, thus broadening the scope of dynamic models available for panel data analysis.
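The role of the marginalization step can be illustrated with a toy resampling experiment. The sketch below is not the authors' algorithm: it assumes a hypothetical two-unit setting in which each particle carries one unit-specific parameter per unit (`psi1`, `psi2`), a Gaussian measurement density for unit 1, and a single multinomial resampling step. All names and numerical values are illustrative assumptions. It compares how many distinct values of `psi2` survive when unit 2's parameter is resampled along with unit 1's, versus when it is left untouched during the unit 1 update.

```python
import numpy as np

def resample_step(psi1, psi2, y1, noise_sd, rng, marginalize):
    """One resampling step driven by unit 1's data only (toy illustration).

    Weights depend only on psi1. If `marginalize` is True, psi2 is not
    resampled, so its particles are all retained; otherwise psi2 is
    resampled with the same indices as psi1.
    """
    # Gaussian log-weights for unit 1 (assumed measurement model).
    log_w = -0.5 * ((y1 - psi1) / noise_sd) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Multinomial resampling of particle indices.
    idx = rng.choice(len(psi1), size=len(psi1), p=w)
    new_psi1 = psi1[idx]
    new_psi2 = psi2 if marginalize else psi2[idx]
    return new_psi1, new_psi2

rng = np.random.default_rng(0)
J = 1000  # number of particles
psi1 = rng.normal(size=J)
psi2 = rng.normal(size=J)

_, psi2_joint = resample_step(psi1, psi2, 1.0, 0.5, rng, marginalize=False)
_, psi2_marg = resample_step(psi1, psi2, 1.0, 0.5, rng, marginalize=True)

n_joint = np.unique(psi2_joint).size  # distinct psi2 values after joint resampling
n_marg = np.unique(psi2_marg).size    # distinct psi2 values when psi2 is left alone
print(n_joint, n_marg)
```

In this toy setting the joint resampling discards many distinct `psi2` values even though unit 1's data carry no information about them, which is the kind of depletion the marginalization step is meant to avoid.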

Paper Structure

This paper contains 13 sections, 4 theorems, 58 equations, 4 figures, 1 algorithm.

Key Result

Theorem 1

Consider a PanelPOMP model defined by Eq. (\ref{eq:ppomp}), and let $\Theta \subset \mathbb{R}^{d_\phi + Ud_\psi}$ be a compact set that satisfies condition \ref{assumption:regular} in Appendix \ref{sec:assumptions}, and assume there exists a $\delta > 0$ such that $\{\theta \in \Theta: |\theta - \hat{\theta}|_2 < \del

Figures (4)

  • Figure 1: Data cloning and marginalized data cloning for a two-parameter model with Gaussian likelihoods and priors. The ellipses show the region of the parameter distribution that contains $95\%$ of the probability mass of the distribution. The black dashed line shows this region for the likelihood surface, and the red "x" marks the MLE. Theorem \ref{theorem:GG} implies that the intermediate posterior densities will converge to a point mass at the MLE.
  • Figure 2: Updating parameter distributions with a single $u = 1$ iteration of both versions of Algorithm \ref{alg:mpif}. (A) The total number of unique particles representing each parameter. The dashed horizontal line shows that MPIF maintains the number of particles for $\Psi_2$ over time. (B) Parameter particle swarm of a single update with and without marginalization compared to the true posterior distribution.
  • Figure 3: Comparison of the MPIF and PIF algorithms for fitting the stochastic Gompertz population model. The solid horizontal line shows the true maximum likelihood, determined via the Kalman filter and a numeric optimizer, an intractable approach for high-dimensional parameter spaces $(U > 100)$; in these cases, the dashed line indicates the likelihood at the data-generating parameters. Each algorithm used 50 unique starting points. Vertical bars span the tenth percentile to the maximum likelihood values
  • Figure 4: Log-likelihoods yielded by PIF and MPIF when fitting the mechanistic measles model to the UK data. Each row corresponds to a different number of particles $J$ used in Algorithm \ref{alg:mpif}. The log-likelihood is evaluated at iterations 20, 56, 92, 128, 164, and 200.
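
The convergence described in the Figure 1 caption can be illustrated in the simplest conjugate case. The sketch below is an assumption-laden simplification, not the paper's construction: it uses a single scalar parameter rather than two, a single observation `y` with known noise variance, and plain (unmarginalized) data cloning, in which each posterior is reused as the prior for the same data. The function names and numerical values are hypothetical. Under these assumptions the MLE is `y` itself, and the iterated posterior mean approaches it while the variance shrinks toward zero.

```python
def bayes_map(mean, var, y, noise_var):
    """One conjugate Gaussian update.

    Prior: theta ~ N(mean, var). Likelihood: y ~ N(theta, noise_var).
    Returns the posterior mean and variance.
    """
    post_prec = 1.0 / var + 1.0 / noise_var
    post_mean = (mean / var + y / noise_var) / post_prec
    return post_mean, 1.0 / post_prec

def iterate_bayes_map(mean, var, y, noise_var, n_iter):
    """Apply the Bayes map repeatedly, feeding each posterior back in as the prior."""
    for _ in range(n_iter):
        mean, var = bayes_map(mean, var, y, noise_var)
    return mean, var

y_obs = 2.5  # with one observation, the MLE of theta is y_obs
final_mean, final_var = iterate_bayes_map(
    mean=0.0, var=4.0, y=y_obs, noise_var=1.0, n_iter=200
)
print(final_mean, final_var)
```

After $k$ iterations the posterior precision is $1/v_0 + k/\sigma^2$, so the prior's influence decays like $1/k$ and the distribution concentrates at the MLE.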

Theorems & Definitions (9)

  • Theorem 1
  • proof : Proof outline
  • Theorem 2
  • Corollary 1
  • proof
  • Lemma 1
  • proof : Proof of Lemma \ref{lemma:bound}
  • proof : Proof of Theorem \ref{theorem:GG}
  • proof