Concordance and Discordance in Cosmology

Marco Raveri, Wayne Hu

TL;DR

This work develops a comprehensive framework of Concordance/Discordance Estimators (CDEs) built on a Gaussian linear model (GLM) to quantify internal and cross-dataset agreement in cosmology while explicitly incorporating prior information. It analyzes goodness-of-fit (GoF), evidence-ratio, and parameter-difference statistics, finding that several LCDM tensions (notably Planck CMB vs. H0 and weak lensing probes) persist and that biases in common estimators can mislead interpretation. By applying GLM-based GoF, DMAP (difference in log-likelihood at maximum posterior), and KL-regularized parameter-difference tests to a broad suite of datasets (Planck, SN, BAO, WL, H0, and lensing), the paper identifies both robust consistencies and notable disagreements, and discusses practical considerations such as non-Gaussian posteriors and prior counting. The results provide a principled path to diagnose systematics, motivate model extensions cautiously, and guide future analyses that use non-Gaussian corrections and multi-dataset joint statistics to assess LCDM's global viability. The approach offers a diagnostic toolkit for current and upcoming cosmological surveys, where increasing precision makes hidden biases and tensions more consequential.

Abstract

The success of present and future cosmological studies is tied to the ability to detect discrepancies in complex data sets within the framework of a cosmological model. Tensions caused by the presence of unknown systematic effects need to be isolated and corrected to increase the overall accuracy of parameter constraints, while discrepancies due to new physical phenomena need to be promptly identified. We develop a full set of estimators of internal and mutual agreement and disagreement, whose strengths complement each other. These allow us to take into account the effect of prior information and to compute the statistical significance of both tensions and confirmatory biases. We apply them to a wide range of state-of-the-art cosmological probes and show that these estimators can be easily used, regardless of model and data complexity. We derive a series of results that show that discrepancies indeed arise within the standard LCDM model. Several of them exceed the probability threshold of 95% and deserve a dedicated effort to understand their origin.

Paper Structure

This paper contains 20 sections, 100 equations, 8 figures, 7 tables.

Figures (8)

  • Figure 1: Geometrical interpretation of the Gaussian linear model. $(x_1,x_2)$ represents data space and $m(\theta)$ a one-dimensional model, i.e. a curve in the $(x_1,x_2)$ space. The figure also shows the linearization of the model and how to decompose the difference between a data realization and the model (at fixed parameters) into components parallel and orthogonal to the model. $m(\theta_{\rm ML})$ shows the model corresponding to the best-fit parameter values for the given data realization. The dashed line shows a constant-likelihood surface, where we assumed for simplicity that the data covariance is proportional to the identity matrix.
  • Figure 2: Geometrical interpretation of the GLM evidence. In all panels $(x_1,x_2)$ represents data space and $m(\theta)$ a one-dimensional model, i.e. a curve in the $(x_1,x_2)$ space. The figure also shows the linearization of the model. The dashed lines correspond to the evidence contours for different prior choices and different confidence levels. The contours show that data realizations drawn from the evidence fall inside the $68\%$ contour $68\%$ of the time, inside the $95\%$ contour $95\%$ of the time, and so on. As in the previous figure we assumed, for simplicity, that the data covariance is proportional to the identity matrix. In the Gaussian prior case we also assumed that $m_{\Pi}=\hat{m}$.
  • Figure 3: The statistical significance of the posterior goodness of fit estimator, $Q_{\rm MAP}$ from Eq. (\ref{Eq:MAPsummary}), applied to different data sets and data set combinations. The labels report different levels of statistical significance: $P_1\equiv 32\%$, $P_2\equiv 5\%$, $P_3\equiv 0.3\%$, $P_4\equiv 0.007\%$ and $P_5\equiv 0.00006\%$. The darker shade indicates results that are not statistically significant.
  • Figure 4: The evidence ratio estimator applied to different data set couples. We show the nominal observed value of the evidence ratio test and its debiased value. Notice that for most of the data sets the bias in the evidence ratio estimator is as large as its observed value. The darker shade indicates results that would not be considered statistically significant on the Jeffreys' scale.
  • Figure 5: The statistical significance of different CDEs for various data set couples: the difference in log-likelihood at maximum posterior (MAP), $Q_{\rm DMAP}$ from Eq. (\ref{Eq:DMAPsummary}), the update parameter shifts test, $Q_{\rm UDM}$ from Eq. (\ref{Eq:UDMsummary}), the exact 1D parameter shifts, $T_1$ from Eq. (\ref{Eq:T1twotailed}), and the "rule of thumb difference in mean", as the Gaussian approximation of $T_1$ from Eq. (\ref{Eq:1DDifferenceMean}). Different colors indicate different tests, as shown in the legend. The labels report different levels of statistical significance: $P_1\equiv 32\%$, $P_2\equiv 5\%$, $P_3\equiv 0.3\%$, $P_4\equiv 0.007\%$. Values that are identified as failure modes of one of the estimators are not shown in the figure. The darker shade indicates results that are not statistically significant.
  • ...and 3 more figures
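
The significance labels $P_1$ to $P_5$ quoted in the captions of Figures 3 and 5 are close to the two-tailed Gaussian tail probabilities at 1 to 5 standard deviations. The sketch below, in Python, shows that conversion, together with one common form of a "rule of thumb difference in mean" for two independent 1D Gaussian constraints. Both function names are hypothetical, and the exact form of the rule of thumb is an assumption here; it is not copied from the paper's equations.

```python
import math


def two_tailed_probability(n_sigma: float) -> float:
    """Probability that a Gaussian variable deviates from its mean
    by more than n_sigma standard deviations, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))


def rule_of_thumb_sigma(mean_a: float, sigma_a: float,
                        mean_b: float, sigma_b: float) -> float:
    """Assumed rule of thumb: difference of the two means in units of
    the quadrature sum of the two uncertainties (independent Gaussians)."""
    return abs(mean_a - mean_b) / math.hypot(sigma_a, sigma_b)


# Tail probabilities, in percent, for 1 to 5 sigma.
levels = {f"P_{n}": 100.0 * two_tailed_probability(n) for n in range(1, 6)}
for label, percent in levels.items():
    print(f"{label} = {percent:.2g}%")

# Illustrative numbers only, not taken from any data set in the paper.
n_sigma = rule_of_thumb_sigma(mean_a=0.0, sigma_a=3.0, mean_b=5.0, sigma_b=4.0)
print(f"shift = {n_sigma:.2f} sigma, "
      f"P = {100.0 * two_tailed_probability(n_sigma):.1f}%")
```

Under this reading, a result labelled beyond $P_2$ corresponds to a discrepancy of more than about two standard deviations in Gaussian-equivalent terms, which is the 95% threshold mentioned in the abstract.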