Exact Sampling of Gibbs Measures with Estimated Losses
David T. Frazier, Jeremias Knoblauch, Jack Jewson, Christopher Drovandi
TL;DR
This work analyzes Gibbs-Bayesian posteriors built from losses that are often intractable and must be estimated via simulation. It shows that posteriors produced by naive pseudo-marginal MCMC depend strongly on the number of simulations $m$ used to estimate the loss, and concentrate slowly unless $m$ grows with the sample size $n$. The authors prove a precise scaling result: to recover standard posterior concentration, the number of simulations must grow as $m(n) \asymp n^{1/\kappa}$, which is often impractical. To overcome this, they develop a modified zig-zag sampler, a piecewise deterministic Markov process (PDMP), that uses unbiased gradient estimators to draw samples from the true Gibbs measure, with complexity linear in $n$ and no dependence on $m$. They demonstrate the approach with β-divergence and maximum mean discrepancy (MMD) losses in copula, regression, and Poisson settings. The work provides both theoretical guarantees and practical algorithms that substantially improve inference when losses are estimated via simulation.
Abstract
In recent years, the shortcomings of Bayesian posteriors as inferential devices have received increased attention. A popular strategy for fixing them has been to instead target a Gibbs measure based on losses that connect a parameter of interest to observed data. However, existing theory for such inference procedures assumes these losses are analytically available, while in many situations these losses must be stochastically estimated using pseudo-observations. In such cases, we show that when standard Markov chain Monte Carlo algorithms are used to produce posterior samples, the resulting posterior exhibits strong dependence on the number of pseudo-observations: unless the number of pseudo-observations diverges sufficiently fast, the resulting posterior will concentrate very slowly. However, we show that in many situations it is feasible to alleviate this dependence entirely using a modified piecewise deterministic Markov process (PDMP) sampler, and we formally and empirically show that these samplers produce posterior draws that have no dependence on the number of pseudo-observations used to estimate the loss within the Gibbs measure. We apply our results to three examples that feature intractable likelihoods and model misspecification.
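The dependence on the number of pseudo-observations can be illustrated with a toy calculation. This is a minimal sketch and not the paper's analysis. It assumes, purely for illustration, that the estimated average loss equals the true loss plus Gaussian noise with variance `sigma**2 / m`. Under that assumption the exponentiated estimate `exp(-n * loss_hat)` is biased upward by the factor `exp(n**2 * sigma**2 / (2 * m))`, so an MCMC scheme that plugs in the estimate no longer targets the exact Gibbs measure unless `m` is large relative to `n`. The names `log_inflation`, `mc_log_inflation`, and `sigma` are hypothetical, and the toy noise model is not meant to reproduce the rate $m(n) \asymp n^{1/\kappa}$.

```python
import numpy as np


def log_inflation(n, m, sigma):
    """Closed-form log E[exp(-n * eps)] for eps ~ N(0, sigma^2 / m).

    Assumption (illustrative only): the loss estimate is the true loss
    plus Gaussian noise eps, so this is the log of the factor by which
    exp(-n * loss_hat) overshoots exp(-n * loss) on average.
    """
    return (n ** 2) * (sigma ** 2) / (2.0 * m)


def mc_log_inflation(n, m, sigma, draws=200_000, seed=0):
    """Monte Carlo check of the same quantity under the same assumption."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma / np.sqrt(m), size=draws)
    return float(np.log(np.mean(np.exp(-n * eps))))


if __name__ == "__main__":
    n, sigma = 10, 0.5
    for m in (50, 500, 5000):
        exact = log_inflation(n, m, sigma)
        approx = mc_log_inflation(n, m, sigma)
        print(f"m={m:5d}  closed form={exact:.4f}  Monte Carlo={approx:.4f}")
```

In this toy model the log-bias scales as `n**2 / m`, so holding `m` fixed while `n` grows makes the distortion worse. A sampler that needs only unbiased gradient estimates, as the abstract describes for the modified PDMP approach, avoids exponentiating a noisy loss altogether.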
