Clarifying the Hubble constant tension with a Bayesian hierarchical model of the local distance ladder
Stephen M. Feeney, Daniel J. Mortlock, Niccolò Dalmasso
TL;DR
This paper recasts the local distance ladder as a Bayesian hierarchical model that infers the Hubble constant $H_0$ while propagating all uncertainties from geometric anchors through Cepheids to Hubble-flow SNe. It argues that non-Gaussian, heavy-tailed likelihoods for the anchors, Cepheids, and SNe are needed to capture the tails of the $H_0$ posterior, which in turn affect model comparison with ΛCDM. Sampling the ~3000-parameter joint posterior with Hamiltonian Monte Carlo, the authors find $H_0 = 72.72 \pm 1.67$ km s$^{-1}$ Mpc$^{-1}$ for the outlier-cleaned Riess et al. (2016) data and $73.15 \pm 1.78$ km s$^{-1}$ Mpc$^{-1}$ when SN outliers are reintroduced. Bayesian model comparison with the Planck XIII (2016) CMB data gives odds against ΛCDM of at worst 10:1 or 7:1, depending on whether the SN outliers are cut or modeled, or 60:1 if an approximation to the Planck Int. XLVI (2016) likelihood is used. The framework removes the need for arbitrary data cuts, enables tail-aware hypothesis testing, and can be extended to incorporate additional datasets (e.g., Gaia) and more realistic outlier models.
Abstract
Estimates of the Hubble constant, $H_0$, from the distance ladder and the cosmic microwave background (CMB) differ at the $\sim$3-$σ$ level, indicating a potential issue with the standard $Λ$CDM cosmology. Interpreting this tension correctly requires a model comparison calculation that depends not only on the traditional '$n$-$σ$' mismatch but also on the tails of the likelihoods. Determining the form of the tails of the local $H_0$ likelihood is impossible with the standard Gaussian least-squares approximation, as it requires using non-Gaussian distributions to faithfully represent anchor likelihoods and model outliers in the Cepheid and supernova (SN) populations, and simultaneous fitting of the full distance-ladder dataset to correctly propagate uncertainties. We have developed a Bayesian hierarchical model that describes the full distance ladder, from nearby geometric anchors through Cepheids to Hubble-flow SNe. This model does not rely on any distributions being Gaussian, allowing outliers to be modeled and obviating the need for arbitrary data cuts. Sampling from the $\sim$3000-parameter joint posterior using Hamiltonian Monte Carlo, we find $H_0 = (72.72 \pm 1.67)$ ${\rm km\,s^{-1}\,Mpc^{-1}}$ when applied to the outlier-cleaned Riess et al. (2016) data, and $H_0 = (73.15 \pm 1.78)$ ${\rm km\,s^{-1}\,Mpc^{-1}}$ with SN outliers reintroduced. Our high-fidelity sampling of the low-$H_0$ tail of the distance-ladder likelihood allows us to apply Bayesian model comparison to assess the evidence for deviation from $Λ$CDM. We set up this comparison to yield a lower limit on the odds of the underlying model being $Λ$CDM given the distance-ladder and Planck XIII (2016) CMB data. The odds against $Λ$CDM are at worst 10:1 or 7:1, depending on whether the SN outliers are cut or modeled, or 60:1 if an approximation to the Planck Int. XLVI (2016) likelihood is used.
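To illustrate why the tails matter for this kind of comparison, the following minimal sketch compares a Gaussian density with a heavy-tailed Student-t density of the same location and scale, evaluated a few $σ$ below the quoted $H_0$ estimate. The Student-t form, the choice of `NU = 4` degrees of freedom, and the helper names are assumptions made purely for illustration; they are not the likelihoods used in the paper. Only the central value and uncertainty (72.72 and 1.67) are taken from the abstract.

```python
import math

# Values quoted in the abstract (km/s/Mpc).
H0_MEAN = 72.72
H0_SIGMA = 1.67

# Hypothetical degrees of freedom for the heavy-tailed comparison;
# an illustrative assumption, not a value from the paper.
NU = 4.0


def gaussian_logpdf(x, mu, sigma):
    """Log-density of a normal distribution with mean mu and width sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(x, mu, sigma, nu):
    """Log-density of a location-scale Student-t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (
        math.lgamma(0.5 * (nu + 1.0))
        - math.lgamma(0.5 * nu)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)
    )


def tail_density_ratio(n_sigma, nu=NU):
    """Student-t density divided by Gaussian density, n_sigma below the mean."""
    x = H0_MEAN - n_sigma * H0_SIGMA
    log_ratio = student_t_logpdf(x, H0_MEAN, H0_SIGMA, nu) - gaussian_logpdf(
        x, H0_MEAN, H0_SIGMA
    )
    return math.exp(log_ratio)


if __name__ == "__main__":
    for n in (0, 1, 3, 5):
        print(f"{n}-sigma below the mean: t/Gaussian density ratio = "
              f"{tail_density_ratio(n):.3g}")
```

Under these assumptions the two densities are similar near the peak, but the heavy-tailed density is several times larger at 3 $σ$ and grows rapidly beyond that. Since the evidence for a model depends on how much likelihood it assigns to the observed discrepancy, a Gaussian approximation can overstate the case against $Λ$CDM if the true likelihood has heavier tails.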
