Clustering-based redshift estimation: method and application to data
Brice Ménard, Ryan Scranton, Samuel Schmidt, Chris Morrison, Donghui Jeong, Tamas Budavari, Mubdi Rahman
TL;DR
The paper introduces a practical, data-driven method to infer the redshift distribution of arbitrary datasets by exploiting spatial cross-correlations with reference populations of known redshifts, using clustering information on all scales and local sampling in photometric space to map photometric observables onto redshift space. It formalizes the relation between sky covariance and redshift distributions, analyzes ideal and non-ideal cases, and then demonstrates the approach on real datasets including LRGs, ELGs, WISE infrared sources, and FIRST radio sources, validating consistency with independent redshift estimates where possible. The study shows the method can recover redshift distributions or at least constrain redshift ranges, particularly when distributions are narrow, and highlights practical strategies and limitations (bias evolution, broad/multi-peaked distributions) with cross-checks using multiple reference samples. Overall, clustering-based redshift estimation offers a scalable way to access the three-dimensional structure of large sky surveys, enabling richer scientific analyses in the absence of complete spectroscopy or perfect photometric redshifts.
Abstract
We present a data-driven method to infer the redshift distribution of an arbitrary dataset, based on spatial cross-correlation with a reference population of known redshift, and we apply it to various datasets across the electromagnetic spectrum to show its potential and limitations. Our approach advocates the use of clustering measurements on all available scales, in contrast to previous works focusing only on linear scales. We also show how its accuracy can be enhanced by optimally sampling a dataset within its photometric space rather than applying the estimator globally. We show that the ultimate goal of this technique is to characterize the mapping between the space of photometric observables and redshift space, as this characterization then allows us to infer the clustering-redshift p.d.f. of a single galaxy. We apply this technique to estimate the redshift distributions of luminous red galaxies and emission-line galaxies from the SDSS, infrared sources from WISE, and radio sources from FIRST. We show that consistent redshift distributions are found using both quasars and absorber systems as reference populations. This technique brings valuable information on the third dimension of astronomical datasets and is applicable to a wide range of extragalactic surveys.
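The core idea of the abstract, that the cross-correlation between an unknown sample and a reference sample split into redshift bins traces the unknown sample's redshift distribution, can be illustrated with a minimal toy sketch. This is not the authors' estimator: it assumes a pixelized sky, a zero-lag (same-pixel) correlation instead of clustering measured on all scales, a constant clustering bias, statistically independent density fields in each redshift bin, and no local sampling in photometric space. The function name `clustering_redshift`, the synthetic maps, and every numerical value are hypothetical and chosen only for illustration.

```python
import numpy as np

def clustering_redshift(unknown_counts, reference_counts):
    """Toy estimator: normalized cross-correlation amplitude per redshift bin.

    unknown_counts:   (n_pix,) counts of the sample with unknown redshifts
    reference_counts: (n_bins, n_pix) counts of the reference sample,
                      one map per redshift bin
    """
    # Overdensity delta = n / <n> - 1 for each map
    delta_u = unknown_counts / unknown_counts.mean() - 1.0
    delta_r = reference_counts / reference_counts.mean(axis=1, keepdims=True) - 1.0
    # Zero-lag cross-correlation of the unknown map with each reference bin
    w = (delta_r * delta_u).mean(axis=1)
    # Assumption: constant bias, so the amplitude is proportional to dN/dz.
    # Negative amplitudes can only come from noise here; clip before normalizing.
    w = np.clip(w, 0.0, None)
    return w / w.sum()

# --- synthetic demo (all numbers are made up for illustration) ---
rng = np.random.default_rng(0)
n_pix, sigma = 50_000, 0.5
true_pz = np.array([0.0, 0.1, 0.6, 0.3, 0.0])  # hypothetical true dN/dz

# Independent lognormal density field in each redshift bin, with mean 1
g = rng.normal(0.0, sigma, size=(true_pz.size, n_pix))
density = np.exp(g - 0.5 * sigma**2)

# Reference sample: Poisson-sampled tracer of each bin's density field
reference_counts = rng.poisson(20.0 * density)
# Unknown sample: sum over bins, weighted by the true redshift distribution
unknown_counts = rng.poisson(20.0 * true_pz[:, None] * density).sum(axis=0)

estimate = clustering_redshift(unknown_counts, reference_counts)
print(np.round(estimate, 2))
```

In this toy setup the recovered fractions should approach `true_pz` as the number of pixels grows. With real data, the evolution of the clustering bias with redshift breaks the simple proportionality assumed in the sketch, which is one of the limitations the paper discusses.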
