Functional Estimation of the Marginal Likelihood
Omiros Papaspiliopoulos, Timothée Stumpf-Fétizon, Jonathan Weare
TL;DR
This paper develops a functional estimator for the marginal likelihood $p(y\mid\lambda)=\int p(y\mid\theta,\lambda)\,p(\theta\mid\lambda)\,d\theta$ in models with high-dimensional latent parameters $\theta$ and low-dimensional hyperparameters $\lambda$, built on the EMUS umbrella-sampling framework. A key idea is to extend grid-based EMUS to the full hyperparameter domain via a kernel-like function $f(\lambda_\ell,\lambda)$, so that $u(\lambda)=\sum_\ell u_\ell f(\lambda_\ell,\lambda)$. The corresponding estimator $\hat{u}(\lambda)=\sum_\ell \hat{u}_\ell \hat{f}(\lambda_\ell,\lambda)$ reproduces the grid estimates, $\hat{u}(\lambda_\ell)=\hat{u}_\ell$, and enables cheap evaluation on a fine grid and gradient-based optimization. The authors prove two consistency results (fixed-grid and dense-grid), leveraging uniform laws of large numbers and, in the dense-grid case, additional smoothness and irreducibility conditions. They also relate EMUS to Gibbs sampling, Vardi's estimator, bridge sampling, and SMC, highlighting robustness to spectral gaps. Numerical experiments on Gaussian process regression, Gaussian process classification, and crossed random effect models demonstrate accurate functional marginal-likelihood estimates, and the paper provides practical guidance on grid design, sampling allocation, and optimal design strategies.
Abstract
We propose a framework for computing, optimizing and integrating with respect to a smooth marginal likelihood in statistical models that involve high-dimensional parameters/latent variables and continuous low-dimensional hyperparameters. The method requires samples from the posterior distribution of the parameters for different values of the hyperparameters on a simulation grid and returns inference on the marginal likelihood defined everywhere on its domain, and on its functionals. We show how the method relates to many of the methods that have been used in this context, including sequential Monte Carlo, Gibbs sampling, Monte Carlo maximum likelihood, and umbrella sampling. We establish the consistency of the proposed estimators as the sampling effort increases, both when the simulation grid is kept fixed and when it becomes dense in the domain. We showcase the approach on Gaussian process regression and classification and crossed effect models.
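To make the construction concrete, the following is a minimal sketch of a grid-based estimator of this kind on a toy conjugate model, not the authors' implementation. It assumes the standard EMUS form $f(\lambda_\ell,\lambda)=E_{\theta\sim p(\theta\mid y,\lambda_\ell)}\left[p(y,\theta\mid\lambda)/\sum_k p(y,\theta\mid\lambda_k)\right]$, under which $u(\lambda)=p(y\mid\lambda)/\sum_k p(y\mid\lambda_k)$ and the grid weights $u_\ell$ form the left eigenvector, with eigenvalue 1, of the matrix $F_{\ell j}=f(\lambda_\ell,\lambda_j)$. The model ($y\mid\theta\sim N(\theta,1)$, $\theta\mid\lambda\sim N(0,\lambda)$), the grid, the sample size, and all function names are illustrative assumptions, chosen so that exact posterior draws and the true marginal likelihood are available for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (an assumption for illustration): y | theta ~ N(theta, 1),
# theta | lam ~ N(0, lam), so that p(y | lam) = N(y; 0, 1 + lam) is known.
y = 1.0
grid = np.array([0.5, 1.0, 2.0, 4.0])  # simulation grid lambda_1..lambda_L
n = 20_000                             # posterior draws per grid point


def log_joint(theta, lam):
    """log p(y | theta) + log p(theta | lam), broadcasting theta against lam."""
    log_lik = -0.5 * np.log(2 * np.pi) - 0.5 * (y - theta) ** 2
    log_prior = -0.5 * np.log(2 * np.pi * lam) - 0.5 * theta ** 2 / lam
    return log_lik + log_prior


# Exact posterior draws at each grid point (in practice these come from MCMC).
post_mean = y * grid / (1.0 + grid)
post_sd = np.sqrt(grid / (1.0 + grid))
theta = post_mean[:, None] + post_sd[:, None] * rng.standard_normal((grid.size, n))

# log of the mixture denominator sum_k p(y, theta | lambda_k), per draw.
log_den = np.logaddexp.reduce(
    log_joint(theta[:, :, None], grid[None, None, :]), axis=2
)


def f_hat(lam):
    """Monte Carlo estimate of f(lambda_l, lam); returns shape (L, len(lam))."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    log_ratio = log_joint(theta[:, :, None], lam[None, None, :]) - log_den[:, :, None]
    return np.exp(log_ratio).mean(axis=1)


# Grid step: F_hat is row-stochastic; u_hat is its left eigenvector for eigenvalue 1.
F_hat = f_hat(grid)
eigvals, eigvecs = np.linalg.eig(F_hat.T)
u_hat = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
u_hat = u_hat / u_hat.sum()  # normalize to sum to 1 (also fixes the sign)


def u_hat_fn(lam):
    """Functional estimate u_hat(lam) = sum_l u_hat_l * f_hat(lambda_l, lam)."""
    return u_hat @ f_hat(lam)


def true_u(lam):
    """Exact p(y | lam) / sum_k p(y | lambda_k) for the toy model."""
    def marginal(v):
        return np.exp(-0.5 * y ** 2 / (1.0 + v)) / np.sqrt(2 * np.pi * (1.0 + v))
    return marginal(np.asarray(lam, dtype=float)) / marginal(grid).sum()


fine = np.linspace(0.5, 4.0, 8)  # evaluation points, mostly off the simulation grid
print(np.column_stack([fine, u_hat_fn(fine), true_u(fine)]))
```

Because $\hat{u}$ is a left eigenvector of $\hat{F}$, evaluating `u_hat_fn` at the grid points returns `u_hat` itself, which is the interpolation property $\hat{u}(\lambda_\ell)=\hat{u}_\ell$. Evaluating at new values of $\lambda$ reuses the stored draws and requires no further sampling.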
