Myths and Truths Concerning Estimation of Power Spectra
G. Efstathiou
TL;DR
The paper addresses the high computational cost of full maximum-likelihood power-spectrum estimation for large CMB data sets and argues that fast pseudo-$C_\ell$ methods, when paired with an accurate covariance model, suffice for practical parameter estimation. It develops a hybrid estimator that uses a low-$\ell$ QML (near-ML) component and high-$\ell$ PCL (fast) components with multiple weights, together with analytic and exact covariance expressions, validated by simulations. The key contribution is showing that this hybrid estimator is nearly statistically equivalent to a full ML solution but with drastically reduced computation, enabling robust cosmological inference via Monte Carlo chains. The approach is generalizable to CMB polarization and other tracers like galaxy clustering and weak lensing, and it highlights the critical role of covariance (and cross-covariances) in reliable parameter estimation under realistic noise and sky-cut conditions.
Abstract
It is widely believed that maximum likelihood estimators must be used to provide optimal estimates of power spectra. Since such estimators require of order $N_d^3$ operations, they are computationally prohibitive for $N_d$ greater than a few tens of thousands. Because of this, a large and inhomogeneous literature exists on approximate methods of power spectrum estimation. These range from manifestly sub-optimal but computationally fast methods to near-optimal but computationally expensive methods. Furthermore, much of this literature concentrates on the power spectrum estimates rather than the equally important problem of deriving an accurate covariance matrix. In this paper, I consider the problem of estimating the power spectrum of cosmic microwave background (CMB) anisotropies from large data sets. Various analytic results on power spectrum estimators are derived, or collated from the literature, and tested against numerical simulations. An unbiased hybrid estimator is proposed that combines a maximum likelihood estimator at low multipoles and pseudo-$C_\ell$ estimates at high multipoles. The hybrid estimator is computationally fast, nearly optimal over the full range of multipoles, and returns an accurate and nearly diagonal covariance matrix for realistic experimental configurations (provided certain conditions on the noise properties of the experiment are satisfied). It is argued that, in practice, computationally expensive methods that approximate the $N_d^3$ maximum likelihood solution are unlikely to improve on the hybrid estimator, and may actually perform worse. The results presented here can be generalised to CMB polarization and to power spectrum estimation using other types of data, such as galaxy clustering and weak gravitational lensing.
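The idea of combining a low-multipole estimate with a high-multipole estimate can be illustrated with a minimal sketch. It is only a hard switch between two precomputed spectra at a single joining multipole, not the estimator constructed in the paper, and it does not address the covariance matrix at all. The function name `hybrid_spectrum`, the parameter `ell_join`, and the input arrays are hypothetical names introduced here for illustration.

```python
import numpy as np


def hybrid_spectrum(cl_low, cl_high, ell_join):
    """Stitch two power-spectrum estimates at a joining multipole.

    cl_low   : estimates used for ell < ell_join (for example, from a
               maximum-likelihood-type estimator); hypothetical input
    cl_high  : estimates used for ell >= ell_join (for example, from a
               pseudo-C_ell estimator); hypothetical input
    ell_join : joining multipole, an assumed free parameter

    Both arrays are indexed by multipole ell = 0, 1, 2, ...
    """
    cl_low = np.asarray(cl_low, dtype=float)
    cl_high = np.asarray(cl_high, dtype=float)
    if cl_low.shape != cl_high.shape:
        raise ValueError("both spectra must cover the same multipoles")
    ell = np.arange(cl_low.size)
    # Take the low-multipole estimate below ell_join, the other one above.
    return np.where(ell < ell_join, cl_low, cl_high)


if __name__ == "__main__":
    low = [1.0, 2.0, 3.0, 4.0]       # toy values, not real spectra
    high = [10.0, 20.0, 30.0, 40.0]  # toy values, not real spectra
    print(hybrid_spectrum(low, high, ell_join=2))
```

A hard switch keeps the sketch short. A practical analysis would also need the covariance of the combined estimates, including the cross-covariance between the two components, which this sketch omits.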
