Making the most of pure parallels: Machine learning augmented photometric redshifts for sparse JWST filter sets
Kenneth J. Duncan
TL;DR
The paper tackles the challenge of photometric redshift estimation for JWST surveys with sparse filter coverage by comparing traditional template fitting (EAzY) to ML-based methods (GPz and NNpz) and exploring hybrid consensus approaches. It demonstrates that NNpz provides the strongest single-method performance up to $z\sim8$, while GPz reduces catastrophic failures when combined with templates. Hierarchical Bayesian fusion of ML and template posteriors yields robust, low-scatter photo-$z$ estimates with fewer outliers ($\sigma_{\text{NMAD}} \approx 0.033$, $\text{OLF}_{0.15} \approx 0.063$ for $m_{\text{F444W}}<27.5$), improving reliability across redshifts. The results, applicable to the PANORAMIC and BEACON pure-parallel surveys, underscore the value of ML and hybrid approaches for maximizing JWST data return, with code and notebooks made publicly available for reproducibility.
Abstract
Photometric redshifts (photo-$z$s) are an essential tool for galaxy evolution science with JWST. However, for deep surveys with more limited filter sets (i.e. $N_{\text{filt}} \sim6$) such as large pure parallel surveys, the most commonly used template-fitting based photo-$z$ approaches can yield highly confident but spurious results for high-$z$ populations of interest. The utility and legacy value of these datasets could therefore be negatively impacted. To address this challenge, we present an application of machine learning (ML) based photo-$z$ techniques to deep JWST photometric datasets. We employ two different ML algorithms, using Gaussian processes and nearest-neighbour estimates, alongside a more standard template fitting approach. We show that simple nearest-neighbour based estimates can provide more accurate photo-$z$s than template fitting out to $z\sim8$, as well as reducing the fraction of catastrophic outliers by a factor of $\sim2-3$. Additionally, 'hybrid' estimates combining template and ML can yield further improvements in overall accuracy and reliability while retaining some ability to predict photo-$z$ out to $z > 10$. The nearest-neighbour-only or hybrid estimates can achieve photo-$z$s with robust scatter of $\sigma_{\text{NMAD}}\sim0.03-0.04$ and outlier fractions of $\sim3-10\%$ between $0 < z \lesssim 8$ from just 6 NIRCam bands, with negligible additional computational costs compared to standard template fitting. Our methodology is easily adaptable to alternative datasets, filter combinations or training samples. Overall, our results highlight the potential for even simple ML techniques to enhance the scientific return of JWST pure parallel and wide-area surveys.
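
The abstract quotes a robust scatter $\sigma_{\text{NMAD}}$ and an outlier fraction but does not define them. As a minimal sketch, assuming the widely used conventions $\sigma_{\text{NMAD}} = 1.48 \times \text{median}(|\delta z - \text{median}(\delta z)|)$ with $\delta z = (z_{\text{phot}} - z_{\text{spec}})/(1 + z_{\text{spec}})$, and an outlier defined by $|\delta z| > 0.15$, the two statistics could be computed as below. The function name `photoz_metrics` and its arguments are hypothetical and are not taken from the paper's released code; the paper's exact definitions may differ.

```python
from statistics import median


def photoz_metrics(z_phot, z_spec, outlier_threshold=0.15):
    """Return (sigma_nmad, outlier_fraction) for matched redshift lists.

    Assumed conventions (not confirmed by the abstract):
      dz = (z_phot - z_spec) / (1 + z_spec)
      sigma_nmad = 1.48 * median(|dz - median(dz)|)
      outlier if |dz| > outlier_threshold
    """
    if len(z_phot) != len(z_spec) or len(z_phot) == 0:
        raise ValueError("z_phot and z_spec must be non-empty and equal length")

    # Normalised residuals, scaled by (1 + z_spec).
    dz = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]

    # Robust scatter: scaled median absolute deviation about the median.
    centre = median(dz)
    sigma_nmad = 1.48 * median([abs(d - centre) for d in dz])

    # Fraction of sources whose normalised residual exceeds the threshold.
    n_outliers = sum(1 for d in dz if abs(d) > outlier_threshold)
    outlier_fraction = n_outliers / len(dz)

    return sigma_nmad, outlier_fraction


# Toy values for illustration only; these are not data from the paper.
toy_z_spec = [1.0, 1.0, 1.0, 1.0, 1.0]
toy_z_phot = [1.0, 1.2, 0.8, 1.0, 3.0]
toy_sigma, toy_olf = photoz_metrics(toy_z_phot, toy_z_spec)
```

Because both statistics are built on medians and counts rather than means, a single catastrophic failure (the last toy source) raises the outlier fraction without inflating the scatter estimate, which is why the two numbers are usually reported together.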
