Multimodal nested sampling: an efficient and robust alternative to MCMC methods for astronomical data analysis

Farhan Feroz, M. P. Hobson

TL;DR

This work tackles the difficulty of Bayesian analysis in astronomy when posteriors are multimodal or possess degeneracies, and when computing evidences for model selection is costly. It extends nested sampling with three new algorithms based on ellipsoidal bounds and a Metropolis variant, incorporating X-means clustering, dynamic bounds, and robust handling of overlapping regions to efficiently compute evidences while delivering accurate posterior inferences. It demonstrates substantial gains in sampling efficiency and robustness on toy problems and applies the methods to Bayesian object detection in images, showing reliable global and local evidences for multiple discrete objects. The approach offers a practical, general replacement for traditional MCMC in astronomical data analysis and paves the way for public software implementations and broader applications, including cosmological parameter estimation tasks.

Abstract

In performing a Bayesian analysis of astronomical data, two difficult problems often emerge. First, in estimating the parameters of some model for the data, the resulting posterior distribution may be multimodal or exhibit pronounced (curving) degeneracies, which can cause problems for traditional MCMC sampling methods. Second, in selecting between a set of competing models, calculation of the Bayesian evidence for each model is computationally expensive. The nested sampling method introduced by Skilling (2004) has greatly reduced the computational expense of calculating evidences and also produces posterior inferences as a by-product. This method has been applied successfully in cosmological applications by Mukherjee et al. (2006), but their implementation was efficient only for unimodal distributions without pronounced degeneracies. Shaw et al. (2007) recently introduced a clustered nested sampling method that is significantly more efficient in sampling from multimodal posteriors and also determines the expectation and variance of the final evidence from a single run of the algorithm, hence providing a further increase in efficiency. In this paper, we build on the work of Shaw et al. and present three new methods for sampling and evidence evaluation from distributions that may contain multiple modes and significant degeneracies; we also present an even more efficient technique for estimating the uncertainty on the evaluated evidence. These methods lead to a further substantial improvement in sampling efficiency and robustness, and are applied to toy problems to demonstrate the accuracy and economy of the evidence calculation and parameter estimation. Finally, we discuss the use of these methods in performing Bayesian object detection in astronomical datasets.
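For readers unfamiliar with the basic nested sampling idea that the abstract builds on, the following is a minimal sketch, not the paper's ellipsoidal or clustered methods. It assumes a toy problem chosen purely for illustration: a normalised unit-variance 2-D Gaussian likelihood and a uniform prior on the square [-5, 5]^2. New live points are drawn by plain rejection sampling from the prior, which is the inefficient step that bounding schemes such as ellipsoids aim to replace. All function names and parameter values here are hypothetical.

```python
import math
import random


def log_likelihood(x, y):
    # Toy assumption: normalised unit-variance 2-D Gaussian centred at the origin.
    return -0.5 * (x * x + y * y) - math.log(2.0 * math.pi)


def sample_prior(rng, half_width=5.0):
    # Toy assumption: uniform prior on the square [-half_width, half_width]^2.
    return (rng.uniform(-half_width, half_width),
            rng.uniform(-half_width, half_width))


def logaddexp(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def nested_sampling(n_live=200, n_iter=1400, seed=0):
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_logl = [log_likelihood(*p) for p in live]

    log_z = -math.inf   # running log-evidence
    log_x_prev = 0.0    # log prior volume X_0 = 1

    for i in range(1, n_iter + 1):
        # The lowest-likelihood live point defines the current contour L_i.
        worst = min(range(n_live), key=live_logl.__getitem__)
        log_l_min = live_logl[worst]

        # Expected shrinkage of the prior volume: X_i = exp(-i / n_live).
        log_x = -i / n_live
        # Shell width w_i = X_{i-1} - X_i, computed in log space.
        log_w = log_x_prev + math.log1p(-math.exp(log_x - log_x_prev))
        log_z = logaddexp(log_z, log_l_min + log_w)
        log_x_prev = log_x

        # Replace the discarded point with a prior draw satisfying L > L_i
        # (plain rejection sampling; cost grows as the contour shrinks).
        while True:
            p = sample_prior(rng)
            ll = log_likelihood(*p)
            if ll > log_l_min:
                break
        live[worst] = p
        live_logl[worst] = ll

    # Add the contribution of the remaining live points, each given an
    # equal share of the final prior volume.
    log_w_live = log_x_prev - math.log(n_live)
    for ll in live_logl:
        log_z = logaddexp(log_z, ll + log_w_live)
    return log_z
```

Calling `nested_sampling()` returns an estimate of ln Z. For this toy setup the analytic value is very close to ln(1/100) ≈ -4.61, since the Gaussian integrates to essentially one inside a prior of area 100, so the estimate should agree with it to within the statistical scatter of the run.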

Paper Structure

This paper contains 32 sections, 24 equations, 11 figures, and 9 tables.

Figures (11)

  • Figure 1: Proper thermodynamic integration requires the log-likelihood to be concave like (a), not (b).
  • Figure 2: Cartoon illustrating (a) the posterior of a two dimensional problem; and (b) the transformed $L(X)$ function where the prior volumes $X_i$ are associated with each likelihood $L_i$.
  • Figure 3: Cartoon of ellipsoidal nested sampling from a simple bimodal distribution. In (a) we see that the ellipsoid represents a good bound to the active region. In (b)-(d), as we nest inward we can see that the acceptance rate will rapidly decrease as the bound steadily worsens. Figure (e) illustrates the increase in efficiency obtained by sampling from each clustered region separately.
  • Figure 4: If the ellipsoids corresponding to different modes are overlapping then sampling from one ellipsoid, enclosing all the points, can be quite inefficient. Multiple overlapping ellipsoids present a better approximation to the iso-likelihood contour of a multimodal distribution.
  • Figure 5: Cartoon of the sub-clustering approach used to deal with degeneracies. The true iso-likelihood contour contains the shaded region. The large enclosing ellipse is typical of that constructed using our basic method, whereas sub-clustering produces the set of small ellipses.
  • ...and 6 more figures