Statistically Efficient Bayesian Sequential Experiment Design via Reinforcement Learning with Cross-Entropy Estimators
Tom Blau, Iadine Chades, Amir Dezfouli, Daniel Steinberg, Edwin V. Bonilla
TL;DR
This work tackles the challenge of statistically efficient Bayesian sequential experiment design by introducing the sequential cross-entropy estimator (sCEE), a lower bound on the expected information gain that avoids the exponential sample complexity of contrastive estimators. By parameterising a flexible posterior with conditional normalising flows and embedding it into a reinforcement learning framework (RL-sCEE), the method learns amortised, non-myopic design policies capable of handling continuous and discrete designs as well as implicit likelihoods. Empirical results across synthetic and realistic tasks show that RL-sCEE can achieve higher information gains with favourable sample efficiency, often outperforming state-of-the-art baselines. The approach offers a flexible, scalable path to efficient experimental design in settings where likelihoods may be intractable or expensive to evaluate.
Abstract
Reinforcement learning can learn amortised policies for designing sequences of experiments. However, current amortised methods rely on estimators of expected information gain (EIG) that require a number of samples exponential in the magnitude of the EIG to achieve unbiased estimation. We propose the use of an alternative estimator based on the cross-entropy of the joint model distribution and a flexible proposal distribution. This proposal distribution approximates the true posterior of the model parameters given the experimental history and the design policy. Our method overcomes the exponential sample complexity of previous approaches and provides more accurate estimates of high EIG values. More importantly, it allows learning of superior design policies, and is compatible with continuous and discrete design spaces, non-differentiable likelihoods and even implicit probabilistic models.
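The contrast between the two kinds of estimator can be illustrated with a minimal sketch. It is not the paper's sCEE, which is sequential and uses a learned flow-based proposal. It assumes a single-step toy model, theta ~ N(0, 1) and y = theta + sigma * eps, chosen because its EIG is known in closed form as 0.5 * log(1 + 1/sigma^2). The function names and the Gaussian proposal q are hypothetical. The cross-entropy bound is H[p(theta)] + E[log q(theta | y)], which is tight when q equals the true posterior, whereas a contrastive bound with L contrastive samples can never exceed log(L + 1).

```python
import numpy as np

def log_normal(x, mean, std):
    """Log density of N(mean, std^2) evaluated at x."""
    return -0.5 * np.log(2 * np.pi * std**2) - 0.5 * ((x - mean) / std) ** 2

def true_eig(sigma):
    """Closed-form EIG of the toy model (assumption: unit Gaussian prior)."""
    return 0.5 * np.log(1.0 + 1.0 / sigma**2)

def cross_entropy_bound(sigma, q_mean_scale, q_std, n=100_000, seed=0):
    """Monte Carlo estimate of H[p(theta)] + E[log q(theta | y)].

    The proposal is q(theta | y) = N(q_mean_scale * y, q_std^2).
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n)
    y = theta + sigma * rng.standard_normal(n)
    prior_entropy = 0.5 * np.log(2 * np.pi * np.e)  # entropy of N(0, 1)
    return prior_entropy + np.mean(log_normal(theta, q_mean_scale * y, q_std))

def contrastive_bound(sigma, num_contrastive, n=20_000, seed=0):
    """Prior-contrastive lower bound; each term is at most log(L + 1)."""
    rng = np.random.default_rng(seed)
    theta0 = rng.standard_normal(n)
    y = theta0 + sigma * rng.standard_normal(n)
    theta_l = rng.standard_normal((n, num_contrastive))
    log_lik0 = log_normal(y, theta0, sigma)              # shape (n,)
    log_lik_l = log_normal(y[:, None], theta_l, sigma)   # shape (n, L)
    all_ll = np.concatenate([log_lik0[:, None], log_lik_l], axis=1)
    # log of the average likelihood over the L + 1 parameter samples
    log_denom = np.logaddexp.reduce(all_ll, axis=1) - np.log(num_contrastive + 1)
    return np.mean(log_lik0 - log_denom)

if __name__ == "__main__":
    sigma = 0.01  # low noise, hence a large EIG
    post_scale = 1.0 / (1.0 + sigma**2)               # exact posterior mean is post_scale * y
    post_std = np.sqrt(sigma**2 / (1.0 + sigma**2))   # exact posterior std
    print("true EIG:             ", true_eig(sigma))
    print("cross-entropy (exact):", cross_entropy_bound(sigma, post_scale, post_std))
    print("cross-entropy (loose):", cross_entropy_bound(sigma, post_scale, 2 * post_std))
    print("contrastive, L = 10:  ", contrastive_bound(sigma, 10))
```

In this toy setting a deliberately misspecified proposal (here, twice the posterior standard deviation) still gives a valid but looser lower bound. The contrastive estimate cannot exceed log(11) however many outer samples are drawn, so raising that ceiling to reach a given EIG requires a number of contrastive samples that grows exponentially in the EIG.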
