Scalable simulation-based inference for implicitly defined models using a metamodel for Monte Carlo log-likelihood estimator
Joonha Park
TL;DR
This paper addresses scalable parameter inference for implicitly defined stochastic models where exact likelihoods are unavailable. It introduces a simulation-based metamodel for the Monte Carlo log-likelihood estimator and proves a local asymptotic normality property for the mean log-likelihood $\mu(\theta;y)$, enabling principled uncertainty quantification. The method yields a MESLE and a simulation-based proxy, built on a quadratic metamodel whose curvature matrices $K_1(\theta_0)$ and $K_2(\theta_0)$ drive inference and uncertainty estimates; it also includes automatic tuning and near-optimal design strategies. Across gamma-Poisson, stochastic volatility, and SEIR measles examples, the approach achieves accurate, scalable inference and compares favorably with pseudo-marginal MCMC. Practical tools are implemented in the R package sbim.
Abstract
Models implicitly defined through a random simulator of a process have become widely used in scientific and industrial applications in recent years. However, simulation-based inference methods for such implicit models, such as approximate Bayesian computation (ABC), often scale poorly as data size increases. We develop a scalable inference method for implicitly defined models using a metamodel for the Monte Carlo log-likelihood estimator derived from simulations. This metamodel characterizes both statistical and simulation-based randomness in the distribution of the log-likelihood estimator across different parameter values. Our metamodel-based method quantifies uncertainty in parameter estimation in a principled manner, leveraging the local asymptotic normality of the mean function of the log-likelihood estimator. We apply this method to construct accurate confidence intervals for parameters of partially observed Markov process models, where the Monte Carlo log-likelihood estimator is obtained using the bootstrap particle filter. We numerically demonstrate that our method enables accurate and highly scalable parameter inference across several examples, including a mechanistic compartment model for infectious diseases.
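To make the idea of a metamodel for a noisy log-likelihood estimator concrete, the following is a minimal toy sketch in Python, not the paper's procedure and not the sbim API. It assumes a scalar parameter, a hypothetical stand-in simulator (`toy_simulated_loglik`, which adds Gaussian noise to an exact normal log-likelihood in place of a particle filter), and an ordinary least-squares quadratic fit. It omits the treatment of statistical versus simulation randomness and the resulting confidence intervals that the paper develops.

```python
import numpy as np

def fit_quadratic_metamodel(thetas, loglik_estimates):
    """Least-squares fit of a + b*theta + c*theta**2 to noisy log-likelihood estimates.

    Returns the coefficients and the maximizer -b / (2c) of the fitted quadratic.
    """
    thetas = np.asarray(thetas, dtype=float)
    ll = np.asarray(loglik_estimates, dtype=float)
    design = np.column_stack([np.ones_like(thetas), thetas, thetas**2])
    coef, *_ = np.linalg.lstsq(design, ll, rcond=None)
    a, b, c = coef
    if c >= 0:
        raise ValueError("fitted quadratic is not concave; no interior maximizer")
    return a, b, c, -b / (2.0 * c)

def toy_simulated_loglik(theta, y, rng, noise_sd=1.0):
    # Hypothetical stand-in for a Monte Carlo log-likelihood estimator:
    # the exact N(theta, 1) log-likelihood plus Gaussian "simulation" noise.
    exact = -0.5 * np.sum((y - theta) ** 2) - 0.5 * y.size * np.log(2 * np.pi)
    return exact + rng.normal(scale=noise_sd)

rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=1.0, size=200)   # observed data (toy)
thetas = np.linspace(0.5, 1.5, 41)             # simulation design points
ll_hat = np.array([toy_simulated_loglik(t, y, rng) for t in thetas])

a, b, c, theta_hat = fit_quadratic_metamodel(thetas, ll_hat)
print(f"metamodel maximizer: {theta_hat:.3f}, sample mean: {y.mean():.3f}")
```

In this toy setting the exact log-likelihood is itself quadratic in the parameter, so the fitted maximizer should land close to the sample mean. The fitted curvature `c` plays a role loosely analogous to the curvature quantities mentioned in the TL;DR, though the correspondence is only illustrative.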
