Closing the Evidence Gap: reddemcee, a Fast Adaptive Parallel Tempering Sampler

Pablo A. Peña, James S. Jenkins

TL;DR

Reddemcee addresses the challenge of accurate Bayesian model evidence estimation while maintaining efficient posterior sampling in high-dimensional, multi-modal problems. It fuses rapid ladder adaptation with three evidence estimators (TI+, SS+, H+) within an adaptive parallel tempering MCMC, enabling high throughput and reliable $\mathcal{Z}$ estimates. Across Gaussian shells, egg-box, Rosenbrock benchmarks, and an exoplanet RV case (HD 20794), reddemcee often matches or surpasses dynamic nested sampling in evidence accuracy and typically outperforms it in sampling throughput while preserving rich posterior information. The method offers robust model comparison for complex astrophysical problems and is poised to impact Bayesian inference workflows where both accurate evidence and detailed posteriors are essential.

Abstract

Markov Chain Monte Carlo (MCMC) excels at sampling complex posteriors but traditionally lags behind nested sampling in accurate evidence estimation, which is crucial for model comparison in astrophysical problems. We introduce reddemcee, an Adaptive Parallel Tempering Ensemble Sampler that aims to close this gap by simultaneously presenting next-generation automated temperature-ladder adaptation techniques and robust, low-bias evidence estimators. reddemcee couples an affine-invariant stretch move with five interchangeable ladder-adaptation objectives (Uniform Swap Acceptance Rate, Swap Mean Distance, Gaussian-Area Overlap, Small Gaussian Gap, and Equalised Thermodynamic Length), implemented through a common differential update rule. Three evidence estimators are provided: Curvature-aware Thermodynamic Integration (TI+), Geometric-Bridge Stepping Stones (SS+), and a novel Hybrid algorithm that blends both approaches (H+). Performance and accuracy are benchmarked on n-dimensional Gaussian Shells, Gaussian Egg-box, Rosenbrock Functions, and the exoplanet radial-velocity time-series of HD 20794. Across Shells up to 15 dimensions, reddemcee achieves roughly 7 times the effective sampling speed of the best dynamic nested sampling configuration. The TI+, SS+ and H+ estimators recover evidence estimates with under 3 percent error and supply realistic uncertainties with as few as six temperatures. In the HD 20794 case study, reddemcee reproduces literature model rankings and yields tighter yet consistent planetary parameters compared with dynesty, with evidence errors that track run-to-run dispersion. By unifying fast ladder adaptation with reliable evidence estimators, reddemcee delivers strong throughput and accurate evidence estimates, often matching, and occasionally surpassing, dynamic nested sampling, while preserving the rich posterior information that makes MCMC indispensable for modern Bayesian inference.
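
To make the link between a temperature ladder and the evidence concrete, the sketch below applies plain trapezoidal thermodynamic integration, $\log\mathcal{Z} = \int_0^1 \mathbb{E}_\beta[\log L]\,\mathrm{d}\beta$. It is not the curvature-aware TI+ estimator named in the abstract, whose details are not given here. The function name `log_evidence_ti` and the one-dimensional toy problem are assumptions made for illustration only.

```python
import numpy as np

def log_evidence_ti(betas, mean_loglikes):
    """Plain trapezoidal thermodynamic integration (NOT the paper's TI+).

    betas         : inverse temperatures; should span beta=0 (prior) to beta=1.
    mean_loglikes : average log-likelihood of the chain sampled at each beta.
    """
    betas = np.asarray(betas, dtype=float)
    mean_loglikes = np.asarray(mean_loglikes, dtype=float)
    order = np.argsort(betas)               # integrate from hot (0) to cold (1)
    b, m = betas[order], mean_loglikes[order]
    return 0.5 * np.sum(np.diff(b) * (m[1:] + m[:-1]))

# Hypothetical toy check: prior N(0, 1), log L(x) = -x^2 / (2 * s2).
# The tempered posterior is Gaussian, so E_beta[log L] = -1 / (2 * (s2 + beta))
# is known in closed form and stands in for per-temperature chain averages.
s2 = 1.0
betas = np.linspace(0.0, 1.0, 201)
mean_loglikes = -0.5 / (s2 + betas)

log_z_ti = log_evidence_ti(betas, mean_loglikes)
log_z_exact = 0.5 * np.log(s2 / (s2 + 1.0))   # analytic log-evidence
print(log_z_ti, log_z_exact)
```

In a real run the averages would come from the chains at each temperature. With only a handful of temperatures the trapezoid rule can be biased where $\mathbb{E}_\beta[\log L]$ curves sharply, which is the kind of error that ladder adaptation and improved estimators aim to reduce.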

Paper Structure

This paper contains 43 sections, 36 equations, 8 figures, 14 tables.

Figures (8)

  • Figure 1: 2-d Gaussian shells likelihood. Radius $r=2$ and width $w=0.1$. Parameter boundaries are imposed as $\pm6$. Likelihood values are coloured from blue (low) to red (high) to facilitate visualisation.
  • Figure 2: Gaussian egg-box likelihood, with 16 different modes. Likelihood values are coloured from blue (low) to red (high) to facilitate visualisation.
  • Figure 3: Probability contour of the 2-d Rosenbrock function. The inset key shows how the colours relate to the probability.
  • Figure 4: Temperature ladder evolution for the 15-d Gaussian shells in the SAR regime (top) and SMD regime (bottom). The x-axis shows the current iteration. From top to bottom, the panels show the temperature ladder evolution $T$, the swap acceptance ratio $T_{swap}$, and the swap mean distance SMD. Colours represent each chain, from blue for the coldest ($\beta=1$) to red with increasing temperature; the hottest chain is omitted. The vertical black dashed line indicates where the adaptation stops.
  • Figure 5: Chain density per temperature range for the SAR and SMD methods in the 15-d Gaussian shells. The blue dashed line denotes the maximum of the SAR, with the green one representing the maximum of the SMD. Both lines also denote the regions where the TI or SS algorithms are applied in the Hybrid method.
  • ...and 3 more figures
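
The Gaussian shells benchmark of Figure 1 can be sketched as follows, using the radius, width and bounds quoted in the caption. The caption does not state the number of shells or their centres; the two shells centred at $\pm3.5$ on the first axis used here are an assumption borrowed from the common form of this test problem, and the function names are hypothetical.

```python
import numpy as np

R, W, BOUND = 2.0, 0.1, 6.0   # radius, width, parameter bound (Figure 1 caption)
CENTRE_OFFSET = 3.5           # ASSUMPTION: shell centres at +/-3.5 on axis 0

def log_prior(theta):
    """Uniform prior on [-BOUND, BOUND]^n; -inf outside the box."""
    theta = np.asarray(theta, dtype=float)
    if np.any(np.abs(theta) > BOUND):
        return -np.inf
    return -theta.size * np.log(2.0 * BOUND)

def log_like_shells(theta):
    """Log-likelihood of two Gaussian shells (assumed layout, see above)."""
    theta = np.asarray(theta, dtype=float)
    centre = np.zeros_like(theta)
    centre[0] = CENTRE_OFFSET
    log_norm = -0.5 * np.log(2.0 * np.pi * W**2)
    d1 = np.linalg.norm(theta - centre)     # distance to the shell at +3.5
    d2 = np.linalg.norm(theta + centre)     # distance to the shell at -3.5
    # Sum the two shell densities in log space for numerical stability.
    return np.logaddexp(log_norm - 0.5 * ((d1 - R) / W) ** 2,
                        log_norm - 0.5 * ((d2 - R) / W) ** 2)

on_shell = log_like_shells([1.5, 0.0])      # a point exactly on one shell
print(on_shell, log_prior([1.5, 0.0]), log_prior([7.0, 0.0]))
```

Because the mass sits on thin, well-separated rings, a cold chain started on one shell rarely reaches the other, which is why this likelihood is a natural stress test for tempered samplers.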