Nearest Neighbour-Based Statistics for 21cm-Galaxy Cross-Correlations in the Epoch of Reionization

Anirban Chakraborty, Kwanit Gangopadhyay, Arka Banerjee, Tirthankar Roy Choudhury

Abstract

21cm radiation from neutral hydrogen serves as a direct probe of the Epoch of Reionization. However, both its detection and physical interpretation are severely hindered by contamination from astrophysical foreground emission and instrumental noise that are several orders of magnitude brighter than the signal of interest. A promising way to tackle these challenges is to cross-correlate the 21cm signal with other independent tracers of large-scale structure, most notably high-redshift galaxies. Besides validating putative 21cm detections, such joint analyses are expected to provide independent insights into the properties of ionizing sources and the evolving morphology of ionized regions during reionization. The 21cm signal, however, is intrinsically highly non-Gaussian, which limits the effectiveness of conventional two-point cross-correlation statistics, since these capture information only up to second order. In this work, we therefore investigate the utility of k-nearest-neighbour cumulative distribution functions (kNN CDFs), which encode information from the joint clustering at all orders, as an alternative framework for probing 21cm-galaxy cross-correlations. Using self-consistently simulated mock 21cm fields and a catalog of line-emitting galaxies at z = 7, we conduct a proof-of-concept study comparing the kNN CDF formalism with the two-point cross-correlation approach. We find that the kNN CDF statistics outperform the two-point statistics in detecting 21cm-galaxy cross-correlations, even in the presence of instrumental noise and aggressive foreground filtering. Moreover, at a fixed global ionized fraction, they can even differentiate between reionization models that remain indistinguishable using two-point statistics. These results demonstrate the power and unexplored potential of exploiting higher-order statistics for extracting maximal information from 21cm-galaxy synergies.
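
To make the central quantity concrete, the following is a minimal sketch of an empirical kNN CDF for a single set of points: the fraction of query points whose k-th nearest data point lies within a distance r. It is not the joint 21cm-galaxy estimator used in the paper, whose details are not given in the abstract. The function name `knn_cdf`, the uniform random points standing in for a galaxy catalog, the periodic box of side 80, and the choices of k and r are all illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree


def knn_cdf(data, queries, ks, radii, boxsize):
    """Empirical CDF of the distance from each query point to its k-th nearest data point."""
    tree = cKDTree(data, boxsize=boxsize)      # periodic box (assumption)
    dist, _ = tree.query(queries, k=list(ks))  # shape (n_queries, len(ks))
    dist = np.sort(dist, axis=0)
    # For each k, count the query points whose k-th neighbour lies within each radius.
    counts = [np.searchsorted(dist[:, j], radii, side="right") for j in range(len(ks))]
    return np.asarray(counts) / len(queries)


# Hypothetical stand-in data: uniform random points instead of a galaxy catalog.
rng = np.random.default_rng(0)
boxsize = 80.0
data = rng.uniform(0.0, boxsize, size=(500, 3))
queries = rng.uniform(0.0, boxsize, size=(2000, 3))
radii = np.linspace(0.0, 70.0, 50)
cdf = knn_cdf(data, queries, ks=(1, 2, 4), radii=radii, boxsize=boxsize)
```

Each row of `cdf` corresponds to one value of k. Because the k-th neighbour is never closer than the (k-1)-th, the curves are ordered, with the k = 1 curve lying on or above the others at every radius.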

Paper Structure (13 sections, 37 equations, 12 figures)

Figures (12)

  • Figure 1: The galaxy UV luminosity function (UVLF) $\Phi_{\rm UV}(M_{\rm UV},z)$ at $6 \leq z \leq 8$ as predicted by our fiducial model across the full simulation volume. In each panel, we also show various observational UVLF measurements [Finkelstein2015, Bowler2020, Bouwens2021, Donnan2023] using colored data points.
  • Figure 2: The galaxy [O III] 5008 Å luminosity function (O3LF) at $z = 7.0$ predicted by our fiducial model for the randomly selected survey region having a volume of $80^3~h^{-3}~\mathrm{cMpc}^3$. We also show observational measurements at $z\sim 7.1$ from several JWST surveys [Wold2025, Meyer2024, Meyer2025] using colored data points.
  • Figure 3: The redshift evolution of the mass-weighted neutral hydrogen fraction predicted by our fiducial model. At redshift $z = 7$, this model corresponds to a mean mass-weighted neutral fraction of $Q^{M}_{\mathrm{HI}}\equiv \langle(1-x_{\mathrm{HII},i})~\Delta_i \rangle = 0.385$, where the average is taken over the full simulation volume. The colored data points represent some of the latest observational measurements [Davies2018, Durovcikova2024, Greig2022, Kageura2025, Mason2025, Gaikwad2023, Umeda2023].
  • Figure 4: A two-dimensional slice of thickness $2\,h^{-1}\,\mathrm{cMpc}$ through the galaxy survey region, showing the spatial distribution of [O III] emitters brighter than $10^{41.5}\,\mathrm{erg\,s^{-1}}$ (shown as points and coloured by their [O III] luminosity), overlaid on the neutral hydrogen fraction field, $x_{\mathrm{HI},i} \equiv 1 - x_{\mathrm{HII},i}$.
  • Figure 5: Effect of instrumental noise and foreground wedge filtering on real-space 21 cm maps and the power spectra of 21 cm fluctuations. Top row: Two-dimensional slices of the mean-subtracted 21 cm brightness temperature fluctuation field, $\delta T_{21}$, from the SCRIPT simulation (left), after adding thermal noise (middle), and after applying both thermal noise and foreground wedge filtering (right). The colour bars show the value of $\delta T_{21}$ in units of mK. Bottom row: The binned cylindrically averaged power spectra, $\log_{10} P_{21}(k_\parallel, k_\perp)$, of fluctuations in $\delta T_{21}$. In the right-most panel, the red dashed line marks the foreground wedge boundary, $k_\parallel = 3.27\,k_\perp$.
  • ...and 7 more figures
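
The foreground wedge filtering mentioned in the Figure 5 caption can be sketched as a mask in Fourier space. The example below is a minimal illustration, not the paper's pipeline: it assumes the line of sight is the last array axis, that modes on or below the boundary $k_\parallel = 3.27\,k_\perp$ are the ones discarded, and it uses a Gaussian random field on a $16^3$ grid in a box of side 80 as a stand-in for a 21 cm map. The function name `wedge_filter` is hypothetical.

```python
import numpy as np


def wedge_filter(field, boxsize, slope=3.27):
    """Zero all Fourier modes with |k_par| <= slope * k_perp (line of sight = last axis)."""
    n = field.shape[0]
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=boxsize / n)
    kx, ky, kz = np.meshgrid(k1d, k1d, k1d, indexing="ij")
    k_perp = np.hypot(kx, ky)
    k_par = np.abs(kz)
    keep = k_par > slope * k_perp  # modes outside the wedge; the k = 0 mode is also dropped
    field_k = np.fft.fftn(field)
    # The mask is symmetric under k -> -k, so the inverse transform is real up to rounding.
    return np.fft.ifftn(field_k * keep).real


# Hypothetical stand-in for a simulated 21 cm map: white Gaussian noise on a cubic grid.
rng = np.random.default_rng(1)
field = rng.normal(size=(16, 16, 16))
filtered = wedge_filter(field, boxsize=80.0)
```

Since the mask only removes modes, the filtered map has zero mean and a smaller variance than the input, and applying the filter a second time leaves the map unchanged.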