The Clustering of Galaxies in the SDSS-III Baryon Oscillation Spectroscopic Survey: Including covariance matrix errors

Will J. Percival, Ashley J. Ross, Ariel G. Sanchez, Lado Samushia, Angela Burden, Robert Crittenden, Antonio J. Cuesta, Mariana Vargas Magana, Marc Manera, Florian Beutler, Chia-Hsun Chuang, Daniel J. Eisenstein, Shirley Ho, Cameron K. McBride, Francesco Montesano, Nikhil Padmanabhan, Beth Reid, Shun Saito, Donald P. Schneider, Hee-Jong Seo, Rita Tojeiro, Benjamin A. Weaver

TL;DR

The paper analyzes how finite mock-based covariance estimates propagate into parameter errors in BOSS 2-point statistics, using a Gaussian likelihood with inverse covariance $\Psi^t$ and a Wishart framework. It derives two key corrections, $m_1$ and $m_2$, to yield unbiased likelihood- and distribution-based parameter errors, and validates them with extensive Monte-Carlo tests. Applying the corrected formalism to DR9–DR11 BAO and RSD measurements, it demonstrates significant, model-dependent adjustments to reported uncertainties and provides practical guidance on optimal binning. The work underscores the necessity of properly accounting for covariance-matrix uncertainties in current and future large-scale structure analyses to ensure robust cosmological inferences.

Abstract

We present improved methodology for including covariance matrices in the error budget of Baryon Oscillation Spectroscopic Survey (BOSS) galaxy clustering measurements, revisiting Data Release 9 (DR9) analyses, and describing a method that is used in DR10/11 analyses presented in companion papers. The precise analysis method adopted is becoming increasingly important, due to the precision that BOSS can now reach: even using as many as 600 mock catalogues to estimate covariance of 2-point clustering measurements can still lead to an increase in the errors of ~20%, depending on how the cosmological parameters of interest are measured. In this paper we extend previous work on this contribution to the error budget, deriving formulae for errors measured by integrating over the likelihood, and to the distribution of recovered best-fit parameters fitting the simulations also used to estimate the covariance matrix. Both are situations that previous analyses of BOSS have considered. We apply the formulae derived to Baryon Acoustic Oscillation (BAO) and Redshift-Space Distortion (RSD) measurements from BOSS in our companion papers. To further aid these analyses, we consider the optimum number of bins to use for 2-point measurements using the monopole power spectrum or correlation function for BAO, and the monopole and quadrupole moments of the correlation function for anisotropic-BAO and RSD measurements.

Paper Structure

This paper contains 13 sections, 25 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Estimated variance for the mean of $n_b$ independent standard Gaussian random variables. The symbols show the estimated variance, averaged over $10^5$ runs, each using $n_s$ data vectors to calculate the covariance matrix. Solid circles show the average variance, calculated from the $n_s$ likelihood distributions derived fitting to $n_s$ independent data vectors (see Section \ref{sec:like}), open circles from the distribution of best-fit solutions recovered from these data (see Section \ref{sec:full_err}), and the solid triangles from the distribution of best-fit solutions when the same data used to estimate the covariance matrix is fitted (see Section \ref{sec:dist_same_data}). No corrections were applied to these estimates - i.e., we assumed that parameters $A$, $B$ or $D$ were zero when making these variance estimates. The lines show the true data-only variance (solid), and the result after including the first order theoretical corrections to the variance from the covariance matrix contribution (dot-dash), the average variance estimated naively from the likelihood (dashed) and from the distribution of data values that were also used to calculate the covariance matrix (dotted).
  • Figure 2: Top panels: Recovered errors from the best-fit values of $\alpha$ calculated by fitting the BAO as described in \citet{anderson12}, but for the BOSS DR10 (left) and DR11 (right) mock samples \citep{manera13b}. Solid circles and the solid line were determined from the likelihood, as described in Section \ref{sec:like}, while open circles and the dashed line were calculated from the distribution of values recovered from the mocks as described in Section \ref{sec:dist_same_data}. The points represent the "raw", uncorrected values, while the lines show the values after correcting for the covariance matrix. Lower panel: Percentage error on the mean value of $\alpha$ recovered from the mocks.
  • Figure 3: As Fig. \ref{fig:pk_results}, but now for the fits to the correlation function.
  • Figure 4: As Fig. \ref{fig:pk_results}, but now for BAO fits to monopole and quadrupole moments of the correlation function as described in \citet{aardwolf13}, now allowing for a different dilation of scale in the radial ($\alpha_\parallel$) and angular ($\alpha_\perp$) directions.
  • Figure 5: As Fig. \ref{fig:pk_results}, but now for RSD fits to monopole and quadrupole moments of the correlation function as described in \citet{samushia13}.
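
The toy experiment described in the Figure 1 caption can be reproduced in a few lines. The following is a minimal sketch, not the authors' code: the function name `naive_mean_variances`, the default values of `n_b`, `n_s` and `n_runs`, and the use of NumPy are all assumptions made for illustration. It fits a single parameter (the common mean) to `n_b` standard Gaussian variables using an inverse covariance estimated from `n_s` independent mock vectors, and applies no correction factors, matching the "raw" points in the figure.

```python
import numpy as np


def naive_mean_variances(n_b=10, n_s=50, n_runs=2000, seed=0):
    """Toy Monte-Carlo in the spirit of Figure 1 (hypothetical helper).

    Returns (mean naive likelihood variance,
             variance of the best-fit means,
             true data-only variance 1/n_b).
    """
    rng = np.random.default_rng(seed)
    ones = np.ones(n_b)              # model: every bin shares one mean
    like_var = np.empty(n_runs)
    best_fit = np.empty(n_runs)

    for i in range(n_runs):
        # Estimate the covariance from n_s mock data vectors.
        mocks = rng.standard_normal((n_s, n_b))
        cov = np.cov(mocks, rowvar=False)   # divides by n_s - 1
        psi = np.linalg.inv(cov)            # uncorrected inverse covariance

        # Fit the mean to an independent data vector.
        data = rng.standard_normal(n_b)
        fisher = ones @ psi @ ones
        like_var[i] = 1.0 / fisher          # width of the Gaussian likelihood
        best_fit[i] = (ones @ psi @ data) / fisher

    return like_var.mean(), best_fit.var(), 1.0 / n_b


if __name__ == "__main__":
    like_var, scatter_var, truth = naive_mean_variances()
    print(f"naive likelihood variance: {like_var:.4f}")
    print(f"scatter of best fits:      {scatter_var:.4f}")
    print(f"true data-only variance:   {truth:.4f}")
```

With a noisy covariance estimate, the uncorrected likelihood width and the scatter of the best-fit values should both differ from the data-only variance $1/n_b$, and from each other. That mismatch is what the corrections discussed in the paper are designed to remove. Raising `n_s` at fixed `n_b` should bring all three numbers together.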