Deep Learning Improves Photometric Redshifts in All Regions of Color Space
Emma R. Moran, Brett H. Andrews, Jeffrey A. Newman, Biprateep Dey
TL;DR
This work tackles the challenge of obtaining accurate photometric redshifts across the full color space relevant to large surveys by comparing image-based deep learning photo-$z$ methods with classical photometry-based machine learning (ML) methods on the SDSS Main Galaxy Sample (MGS). By partitioning color space with a self-organizing map (SOM), it shows that deep learning reduces photo-$z$ scatter in every SOM cell and largely removes the color-dependent attenuation bias that affects classical ML, with a smaller fractional reduction in scatter in the most bulge-dominated and reddest cells; the improvement is attributed to the networks exploiting pixel-level color information. The study combines global performance metrics with per-cell analyses and supports its conclusions with Monte Carlo experiments inspired by attenuation-bias theory. The findings are of practical significance for upcoming surveys (e.g., Euclid, LSST, Roman), guiding the design of photo-$z$ pipelines that robustly handle local color-space variations and complex galaxy morphologies.
Abstract
Photometric redshifts (photo-$z$'s) are crucial for the cosmology, galaxy evolution, and transient science drivers of next-generation imaging facilities like the Euclid Mission, the Rubin Observatory, and the Nancy Grace Roman Space Telescope. Previous work has shown that image-based deep learning photo-$z$ methods produce smaller scatter than photometry-based classical machine learning (ML) methods on the Sloan Digital Sky Survey (SDSS) Main Galaxy Sample, a testbed photo-$z$ dataset. However, global assessments can obscure local trends. To explore this possibility, we used a self-organizing map (SOM) to cluster SDSS galaxies based on their $ugriz$ colors. Deep learning methods achieve lower photo-$z$ scatter than classical ML methods for all SOM cells. The fractional reduction in scatter is roughly constant across most of color space with the exception of the most bulge-dominated and reddest cells where it is smaller in magnitude. Interestingly, classical ML photo-$z$'s suffer from a significant color-dependent attenuation bias, where photo-$z$'s for galaxies within a SOM cell are systematically biased towards the cell's mean spectroscopic redshift and away from extreme values, which is not readily apparent when all objects are considered. In contrast, deep learning photo-$z$'s suffer from very little color-dependent attenuation bias. The increased attenuation bias for classical ML photo-$z$ methods is the primary reason why they exhibit larger scatter than deep learning methods. This difference can be explained by the deep learning methods weighting redshift information from the individual pixels of a galaxy image more optimally than integrated photometry.
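The per-cell attenuation bias described above can be illustrated with a small toy simulation. The sketch below is not the authors' pipeline: the cell labels are assigned directly rather than produced by a SOM, the function names (`sigma_nmad`, `attenuation_slope`, `per_cell_metrics`) are hypothetical, the scatter metric (a normalized median absolute deviation of the scaled residuals) is an assumed, commonly used choice, and the shrink factor and noise levels are arbitrary. It shows how an estimator that pulls photo-$z$'s toward each cell's mean redshift can have a within-cell slope far below 1 while its global slope still looks close to 1.

```python
import numpy as np

def sigma_nmad(z_spec, z_phot):
    """Normalized MAD of the scaled residuals (z_phot - z_spec) / (1 + z_spec)."""
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    return float(1.4826 * np.median(np.abs(dz - np.median(dz))))

def attenuation_slope(z_spec, z_phot):
    """Least-squares slope of z_phot against z_spec.

    A slope well below 1 means predictions are compressed toward the mean,
    which is the signature of attenuation bias.
    """
    zs = z_spec - z_spec.mean()
    zp = z_phot - z_phot.mean()
    return float(np.dot(zs, zp) / np.dot(zs, zs))

def per_cell_metrics(cell_id, z_spec, z_phot):
    """Return {cell: (slope, sigma_nmad)} for each distinct cell label."""
    out = {}
    for c in np.unique(cell_id):
        mask = cell_id == c
        out[int(c)] = (
            attenuation_slope(z_spec[mask], z_phot[mask]),
            sigma_nmad(z_spec[mask], z_phot[mask]),
        )
    return out

# Toy data: four "color cells", each with its own mean redshift (assumed values).
rng = np.random.default_rng(0)
n_per_cell = 5000
cell_means = np.array([0.08, 0.12, 0.16, 0.20])
cell_id = np.repeat(np.arange(cell_means.size), n_per_cell)
mu = cell_means[cell_id]
z_spec = mu + rng.normal(0.0, 0.015, cell_id.size)

# Estimator A: mostly knows the cell mean, so it shrinks toward it (attenuated).
shrink = 0.3  # arbitrary illustrative value
z_attenuated = mu + shrink * (z_spec - mu) + rng.normal(0.0, 0.005, cell_id.size)

# Estimator B: tracks the true redshift within each cell (little attenuation).
z_sharp = z_spec + rng.normal(0.0, 0.005, cell_id.size)

metrics_attenuated = per_cell_metrics(cell_id, z_spec, z_attenuated)
metrics_sharp = per_cell_metrics(cell_id, z_spec, z_sharp)
global_slope_attenuated = attenuation_slope(z_spec, z_attenuated)

print("global slope (attenuated):", global_slope_attenuated)
for c in metrics_attenuated:
    print(c, "attenuated:", metrics_attenuated[c], "sharp:", metrics_sharp[c])
```

In this toy setup the cell-to-cell spread in mean redshift dominates the total variance, so the global slope hides the within-cell compression; only the per-cell view exposes it, and the compressed estimator also shows the larger per-cell scatter.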
