Indicator Functions: Distilling the Information from Gaussian Random Fields
Andrew Repp, Ravi K. Sheth, Istvan Szapudi, Yan-Chuan Cai
TL;DR
This paper addresses the fact that the Fisher information on the power-spectrum amplitude of a Gaussian random field is fixed in total but unevenly distributed across the field once it is smoothed. It uses indicator functions to partition the field by density and derives analytic expressions for the information in the corresponding indicator correlations $\xi_I(r)$, focusing on separations $r \in [60,80)\,h^{-1}\mathrm{Mpc}$ and finding that most of the information resides in moderately rare, high-density regions. The authors show that, for finite surveys, the information in $\xi_I$ can exceed that in the standard two-point function $\xi(r)$, and they provide practical expressions for the Fisher information $\mathcal{I}_{A_z}$ on the power-spectrum amplitude, including a low-probability limit. These results offer a principled way to optimize sampling with density-based statistics (density-split and marked statistics), with implications for robust BAO amplitude measurements and efficient cosmological inference.
Abstract
A random Gaussian density field contains a fixed amount of Fisher information on the amplitude of its power spectrum. For a given smoothing scale, however, that information is not evenly distributed throughout the smoothed field. We investigate which parts of the field contain the most information by smoothing the field and splitting it into density bins (using the formalism of indicator functions), deriving analytic expressions for the information content of each density bin in the joint probability distribution at a given separation. When we choose one particular distance regime (namely, cells separated by $60$-$80\,h^{-1}$ Mpc), we find that the information in that range peaks at moderately rare densities (where the number of smoothed survey cells is of order 100). Counter-intuitively, we find that, for a finite survey volume (again at a particular distance range), indicator function analysis can outperform conventional two-point statistics while using only a fraction of the total survey cells, and we explain why. In light of recent developments in marked statistics (such as the indicator power spectrum and density-split clustering), this result elucidates how to optimize sampling for effective extraction of cosmological information.
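To make the density-splitting idea concrete, the following is a minimal illustrative sketch, not the authors' pipeline. It assumes a periodic 1D toy field, a Gaussian smoothing kernel, a density bin of $[1,2)$ in units of the field's standard deviation, and the normalization $\xi_I(r) = \langle I(x)\,I(x+r)\rangle / P^2 - 1$, where $P$ is the fraction of cells in the bin. The function names `indicator` and `indicator_correlation` are hypothetical.

```python
import numpy as np

def indicator(field, lo, hi):
    """Return 1.0 where lo <= field < hi and 0.0 elsewhere."""
    return ((field >= lo) & (field < hi)).astype(float)

def indicator_correlation(ind, lag):
    """Estimate xi_I(lag) = <I(x) I(x+lag)> / P^2 - 1 on a periodic 1D grid.

    Assumed normalization: P is the mean of the indicator, i.e. the
    fraction of cells that fall in the density bin.
    """
    p = ind.mean()
    if p == 0:
        raise ValueError("density bin contains no cells")
    joint = np.mean(ind * np.roll(ind, -lag))
    return joint / p**2 - 1.0

# Toy Gaussian field: white noise smoothed with a Gaussian kernel in Fourier space.
rng = np.random.default_rng(0)
n = 4096
sigma = 4.0  # smoothing scale in grid cells (arbitrary choice for illustration)
white = rng.standard_normal(n)
k = 2.0 * np.pi * np.fft.rfftfreq(n)
smooth = np.fft.irfft(np.fft.rfft(white) * np.exp(-0.5 * (k * sigma) ** 2), n)
smooth /= smooth.std()  # express the field in units of its standard deviation

# Select a moderately rare density bin and measure its clustering.
ind = indicator(smooth, 1.0, 2.0)
p = ind.mean()
xi0 = indicator_correlation(ind, 0)
xi_lag = indicator_correlation(ind, 20)
```

At zero lag the estimator reduces to $1/P - 1$, because $I^2 = I$; this gives a quick consistency check on the normalization. Rarer bins have smaller $P$ and hence fewer contributing cells, which is the trade-off the abstract describes for a finite survey volume.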
