How to evaluate the sufficiency and complementarity of summary statistics for cosmic fields: an information-theoretic perspective
Ce Sui, Yi Mao, Xiaosheng Zhao, Tao Jing, Benjamin D. Wandelt
TL;DR
The paper addresses how to quantify information content and complementarity of summary statistics for cosmic fields using mutual information $I(\theta;x)$ and conditional MI. It introduces a principled framework for information sufficiency and decomposition into shared and complementary information, with variational MI estimation using a flexible model $q(\theta|s)$ (e.g., masked autoregressive flow). The authors validate the approach on a Gaussian CMB-like field where the power spectrum $P(k)$ captures essentially all information, and on non-Gaussian 21 cm maps where the wavelet scattering transform (ST) provides the most information and substantial complementarity to $P(k)$ and the bispectrum. They demonstrate that MI can guide the design and evaluation of summaries, and discuss practical estimation challenges in high-dimensional field data, pointing toward learning summaries that maximize MI.
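For orientation, the quantities named above can be written out with standard information-theoretic identities. This is a sketch and not necessarily the exact formulation used in the paper. The symbols $s_1$, $s_2$ (two summaries of the field $x$) and $H$ (differential entropy) are notation assumed here. For any summary $s = s(x)$, the data processing inequality gives

$$ I(\theta; s) \le I(\theta; x), $$

with equality when $s$ is sufficient for $\theta$. The chain rule splits the information in a pair of summaries into what $s_1$ carries alone and what $s_2$ adds on top of it:

$$ I(\theta; s_1, s_2) = I(\theta; s_1) + I(\theta; s_2 \mid s_1). $$

A variational model $q(\theta|s)$ yields a lower bound on the MI (the Barber-Agakov bound), which becomes tight when $q(\theta|s)$ matches the true posterior $p(\theta|s)$:

$$ I(\theta; s) = H(\theta) - H(\theta \mid s) \ge H(\theta) + \mathbb{E}_{p(\theta, s)}\left[\log q(\theta \mid s)\right]. $$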
Abstract
The advent of increasingly advanced surveys and cosmic tracers has motivated the development of new inference techniques and novel approaches to extracting information from cosmic fields. A central challenge in this endeavor is to quantify the information content carried by the summary statistics extracted from cosmic fields. In particular, how should we determine which statistics are more informative than others, and how should we quantify the degree of complementarity of the information from each statistic? Here, we introduce mutual information (MI), which provides, from an information-theoretic perspective, a natural framework for assessing the sufficiency and complementarity of summary statistics in cosmological data. We demonstrate how MI can be applied to typical inference tasks to make information-theoretic evaluations, using two representative examples: the cosmic microwave background map, from which the power spectrum extracts almost all information, as expected for a Gaussian random field, and the 21 cm brightness temperature map, from which the scattering transform extracts the most non-Gaussian information and is complementary to the power spectrum and bispectrum. Our results suggest that MI offers a robust theoretical foundation for evaluating and improving summaries, thereby enabling a deeper understanding of cosmic fields from an information-theoretic perspective.
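As a rough illustration of how such an evaluation might look in practice, the following toy sketch estimates MI with a variational posterior and compares two summaries. Everything in it is an assumption made for illustration and is not the paper's pipeline: the function name `variational_mi`, the scalar parameter with a standard normal prior, the two noisy summaries `s1` and `s2`, and the linear-Gaussian $q(\theta|s)$, which stands in for a flexible model such as a masked autoregressive flow. The toy is chosen because its MI values are known analytically.

```python
import numpy as np


def variational_mi(theta, s):
    """Lower bound on I(theta; s) in nats, H(theta) + E[log q(theta|s)].

    Assumptions (illustrative only): theta is scalar with a Gaussian prior,
    and q(theta|s) is a linear-Gaussian model fitted by least squares.
    The fit and the evaluation use the same samples, which is acceptable
    for this low-capacity q but would call for held-out data with a flow.
    """
    theta = np.asarray(theta, dtype=float)
    s = np.asarray(s, dtype=float).reshape(len(theta), -1)
    # Mean of q(theta|s): linear in s, with an intercept column.
    design = np.column_stack([s, np.ones(len(theta))])
    coef, *_ = np.linalg.lstsq(design, theta, rcond=None)
    var_q = (theta - design @ coef).var()
    # Average log-density of a Gaussian q with maximum-likelihood variance.
    mean_log_q = -0.5 * np.log(2.0 * np.pi * np.e * var_q)
    # Entropy of the (assumed Gaussian) prior, from the sample variance.
    h_theta = 0.5 * np.log(2.0 * np.pi * np.e * theta.var())
    return h_theta + mean_log_q


rng = np.random.default_rng(0)
n = 200_000
theta = rng.normal(size=n)                 # toy prior: theta ~ N(0, 1)
s1 = theta + 1.0 * rng.normal(size=n)      # noisier summary
s2 = theta + 0.5 * rng.normal(size=n)      # less noisy summary

mi_1 = variational_mi(theta, s1)
mi_2 = variational_mi(theta, s2)
mi_12 = variational_mi(theta, np.column_stack([s1, s2]))
# Chain rule: information that s1 adds once s2 is already known.
complementary_1_given_2 = mi_12 - mi_2

print(f"I(theta; s1)      ~ {mi_1:.3f} nats")
print(f"I(theta; s2)      ~ {mi_2:.3f} nats")
print(f"I(theta; s1, s2)  ~ {mi_12:.3f} nats")
print(f"I(theta; s1 | s2) ~ {complementary_1_given_2:.3f} nats")
```

In this linear-Gaussian toy the exact values are $\tfrac{1}{2}\ln 2$, $\tfrac{1}{2}\ln 5$, and $\tfrac{1}{2}\ln 6$ nats for $s_1$, $s_2$, and the pair, so the estimates can be checked directly. The small gap between the joint value and that of $s_2$ alone shows that $s_1$ adds little once $s_2$ is known, which is the kind of complementarity statement the MI framework makes quantitative.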
