Unifying Summary Statistic Selection for Approximate Bayesian Computation
Till Hoffmann, Jukka-Pekka Onnela
TL;DR
This work shows that minimizing the expected posterior entropy (EPE) under the prior predictive distribution provides a unifying framework for learning informative summary statistics in approximate Bayesian computation. It demonstrates that many established information-theoretic approaches are equivalent to, or special or limiting cases of, EPE minimization, and it proposes practical conditional density estimation methods (e.g., mixture density networks) to learn high-fidelity summaries automatically. On three diverse problems (a multimodal benchmark, a population genetics model, and a dynamic network model of growing trees), the authors show that EPE-based summaries can yield posterior inference competitive with dedicated likelihood-based approaches, offering a general tool for likelihood-free inference. The study also clarifies the distinctions among sufficient, lossless, and optimal summaries and provides guidance on when and how to apply EPE-based compression in practice.
Abstract
Extracting low-dimensional summary statistics from large datasets is essential for efficient (likelihood-free) inference. We characterize three classes of summaries (sufficient, lossless, and optimal) and demonstrate their importance for correctly analyzing dimensionality reduction algorithms. We show that minimizing the expected posterior entropy (EPE) under the prior predictive distribution of the model provides a unifying principle: many existing methods are equivalent to, or are special or limiting cases of, minimizing the EPE. Building on this principle, we propose a practical method that uses conditional density estimation to learn high-fidelity summaries automatically. We evaluate this approach on diverse problems, including a challenging benchmark model with a multimodal posterior, a population genetics model, and a dynamic network model of growing trees. The results show that EPE-minimizing summaries can lead to posterior inference that is competitive with, and in some cases superior to, dedicated likelihood-based approaches, providing a powerful and general tool for practitioners.
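
To make the EPE criterion concrete, the following minimal sketch estimates it by Monte Carlo for a toy conjugate Gaussian model. It is an illustration only and is not one of the problems studied in the paper. It assumes a prior theta ~ N(0, 1) and observations y_i ~ N(theta, 1), and it uses the exact posterior given each summary rather than a learned conditional density estimator such as a mixture density network. The function names `estimate_epe` and `exact_epe` are hypothetical. The EPE is estimated as the average of -log p(theta | t(y)) over draws of (theta, y) from the joint distribution, that is, with y drawn from the prior predictive.

```python
import numpy as np


def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)


def estimate_epe(k, n_obs=10, n_draws=200_000, seed=0):
    """Monte Carlo estimate of the EPE for the summary
    t(y) = mean of the first k observations (hypothetical toy setup)."""
    rng = np.random.default_rng(seed)
    # Draw (theta, y) from the joint: theta from the prior, y from the likelihood.
    theta = rng.normal(0.0, 1.0, size=n_draws)
    y = rng.normal(theta[:, None], 1.0, size=(n_draws, n_obs))
    # Compress each dataset to a one-dimensional summary.
    t = y[:, :k].mean(axis=1)
    # Exact conjugate posterior of theta given t: N(k t / (k + 1), 1 / (k + 1)).
    post_mean = k * t / (k + 1)
    post_var = 1.0 / (k + 1)
    # EPE estimate: negative mean log posterior density at the true parameters.
    return -gaussian_log_pdf(theta, post_mean, post_var).mean()


def exact_epe(k):
    """Closed-form entropy of the posterior N(., 1 / (k + 1)) in this toy model."""
    return 0.5 * np.log(2 * np.pi * np.e / (k + 1))


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(k, estimate_epe(k), exact_epe(k))
```

In this toy model the mean of all ten observations is a sufficient statistic, so it attains the lowest EPE, while summaries that discard observations (smaller `k`) leave more posterior uncertainty. In a realistic setting the exact posterior is unavailable, and the density inside the logarithm would be replaced by a trained conditional density estimator whose parameters are optimized jointly with the summary function.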
