Karhunen-Loeve eigenvalue problems in cosmology: how should we tackle large data sets?

Max Tegmark, Andy Taylor, Alan Heavens

TL;DR

Karhunen-Loève (KL) eigenvalue methods offer principled data compression for cosmological parameter estimation in the era of huge CMB maps and galaxy surveys. The paper develops a general KL framework, extends it to multi-parameter inference via Singular Value Decomposition (SVD), and analyzes both linear and nonlinear precompression strategies, including the signal-to-noise eigenmode approach and quadratic compression. It demonstrates the approach on CMB and IRAS data, showing that essential cosmological information can be preserved with roughly an order-of-magnitude data reduction and that precompression is essential for next-generation datasets. The results indicate that near-maximal parameter precision is achievable in practice with substantial computational savings, while highlighting remaining challenges in precompression design and incomplete sky coverage.

Abstract

Since cosmology is no longer "the data-starved science", the problem of how to best analyze large data sets has recently received considerable attention, and Karhunen-Loève eigenvalue methods have been applied to both galaxy redshift surveys and Cosmic Microwave Background (CMB) maps. We present a comprehensive discussion of methods for estimating cosmological parameters from large data sets, which includes the previously published techniques as special cases. We show that both the problem of estimating several parameters jointly and the problem of not knowing the parameters a priori can be readily solved by adding an extra singular value decomposition step. It has recently been argued that the information content in a sky map from a next generation CMB satellite is sufficient to measure key cosmological parameters ($h$, $\Omega$, $\Lambda$, etc.) to an accuracy of a few percent or better - in principle. In practice, the data set is so large that both a brute force likelihood analysis and a direct expansion in signal-to-noise eigenmodes will be computationally unfeasible. We argue that it is likely that a Karhunen-Loève approach can nonetheless measure the parameters with close to maximal accuracy, if preceded by an appropriate form of quadratic "pre-compression". We also discuss practical issues regarding parameter estimation from present and future galaxy redshift surveys, and illustrate this with a generalized eigenmode analysis of the IRAS 1.2 Jy survey optimized for measuring $\beta=\Omega^{0.6}/b$ using redshift space distortions.
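
To make the idea of an eigenmode expansion concrete, the following is a minimal sketch of single-parameter KL compression, not the authors' code. It assumes zero-mean Gaussian data whose covariance $C$ depends on one parameter $\theta$, solves the generalized eigenvalue problem $C_{,\theta}\,b = \lambda\,C\,b$ with the normalization $b^T C\,b = 1$, and uses the standard Gaussian Fisher information $F = \frac{1}{2}\sum_i \lambda_i^2$. The toy covariance model, the function name `kl_modes`, and all variable names are hypothetical choices made for illustration.

```python
import numpy as np

def kl_modes(C, dC):
    """Solve dC b = lam C b with B.T @ C @ B = I, sorted by decreasing |lam|."""
    L = np.linalg.cholesky(C)                          # C = L L^T
    M = np.linalg.solve(L, np.linalg.solve(L, dC).T)   # L^{-1} dC L^{-T}
    M = 0.5 * (M + M.T)                                # remove round-off asymmetry
    lam, W = np.linalg.eigh(M)                         # ordinary symmetric problem
    order = np.argsort(-np.abs(lam))                   # most informative modes first
    lam, W = lam[order], W[:, order]
    B = np.linalg.solve(L.T, W)                        # map back: b = L^{-T} w
    return lam, B

# Toy model (an assumption, not from the paper): C(theta) = theta * S + N,
# evaluated at the fiducial value theta = 1, so that dC/dtheta = S.
rng = np.random.default_rng(0)
n = 200
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))       # random orthogonal basis
power = 10.0 / (1.0 + np.arange(n)) ** 2               # falling signal power
S = (Q * power) @ Q.T                                  # Q diag(power) Q^T
N = np.eye(n)                                          # white noise
C = S + N
dC = S

lam, B = kl_modes(C, dC)

# Fisher information for theta: full data set versus the n_keep best modes.
fisher_full = 0.5 * np.sum(lam ** 2)
n_keep = 20
fisher_kept = 0.5 * np.sum(lam[:n_keep] ** 2)

# Compress one simulated data vector x down to n_keep numbers y = B^T x.
x = np.linalg.cholesky(C) @ rng.standard_normal(n)
y = B[:, :n_keep].T @ x

print("fraction of Fisher information kept:", fisher_kept / fisher_full)
print("best attainable error bar, full vs compressed:",
      fisher_full ** -0.5, fisher_kept ** -0.5)
```

In this toy model the leading modes carry most of the information, so keeping a tenth of them barely widens the error bar. The sketch handles only one parameter at a fixed fiducial model; it does not attempt the SVD step or the quadratic pre-compression discussed in the abstract.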

Paper Structure

This paper contains 30 sections, 62 equations, 5 figures.

Figures (5)

  • Figure 1: The derivatives of the CDM power spectrum with respect to various parameters.
  • Figure 2: KL-eigenvalues $\lambda=1/\Delta\beta$.
  • Figure 3: Error bar on $\beta$ as a function of $n'$, the number of eigenmodes used.
  • Figure 4: The three heavy lines show the smallest error bars attainable for the three parameters $Q$, $n$ and $\tau$ as a function of the number of modes used, i.e., the error bars obtained when using the separately optimized KL-modes for each parameter. The three thin lines show the error bars obtained when using the 500 SVD modes, illustrating that these contain essentially all the relevant information about all three parameters.
  • Figure 5: The error bars on the power spectrum normalization are shown for hypothetical COBE experiments with different noise levels. From top to bottom, they correspond to a noise enhancement factor 10, the real 4 year data, and noise reduction factors of 10, 100 and 1000.