DeepRed: an architecture for redshift estimation

Alessandro Meroni; Nicolò Oreste Pinciroli Vago; Piero Fraternali

TL;DR

Redshift estimation from astronomical images is costly and constrained by dataset heterogeneity. DeepRed introduces a pipeline that combines generic computer-vision backbones (such as EfficientNet, Swin Transformer, and MLP-Mixer) through a linear regression ensemble on latent outputs to robustly predict redshift for galaxies, gravitational lenses, and gravitationally-lensed transients, while incorporating SHAP-based explainability. Across four simulated DeepGraviLens datasets and real KiDS and SDSS data, DeepRed achieves state-of-the-art results, reducing NMAD and $\sigma_{68}$ by up to $55\%$ and $50\%$, respectively, and lowering bias and outlier rates, with SHAP localization exceeding $95\%$ accuracy on high-quality images. The results demonstrate strong generalization across morphologies and observational conditions, supporting scalable redshift estimation for upcoming sky surveys; future work includes cross-instrument validation and integration of time-domain data.
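The linear regression ensemble mentioned above can be illustrated with a minimal sketch. This summary does not specify which latent outputs are combined or how the regression is fitted, so the example assumes the simplest case: each backbone contributes one scalar redshift prediction, and the weights are fitted by ordinary least squares on held-out data. The function names, the synthetic data, and the noise levels are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_linear_ensemble(features, targets):
    """Fit ensemble weights and an intercept by ordinary least squares.

    features: (n_samples, n_models) array, one column per backbone output.
    targets:  (n_samples,) array of ground-truth redshifts.
    """
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef[:-1], coef[-1]


def predict_linear_ensemble(features, weights, intercept):
    """Combine the per-backbone outputs into a single redshift estimate."""
    return features @ weights + intercept


# Synthetic stand-in data (assumption): three hypothetical backbones whose
# predictions equal the true redshift plus independent Gaussian noise.
rng = np.random.default_rng(0)
z_true = rng.uniform(0.0, 1.0, size=500)
noise_scales = [0.05, 0.08, 0.10]
backbone_preds = np.column_stack(
    [z_true + rng.normal(0.0, s, size=z_true.size) for s in noise_scales]
)

# Fit the ensemble on one split and evaluate it on another.
train, test = slice(0, 400), slice(400, 500)
weights, intercept = fit_linear_ensemble(backbone_preds[train], z_true[train])
z_ens = predict_linear_ensemble(backbone_preds[test], weights, intercept)

ens_mae = np.mean(np.abs(z_ens - z_true[test]))
single_mae = np.mean(np.abs(backbone_preds[test] - z_true[test][:, None]), axis=0)
print("ensemble MAE:", ens_mae)
print("per-backbone MAE:", single_mae)
```

In this toy setting the fitted weights tend to favor the less noisy backbones. Whether the same holds for the actual DeepRed ensemble depends on details not given in this summary.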

Abstract

Estimating redshift is a central task in astrophysics, but its measurement is costly and time-consuming. In addition, current image-based methods are often validated on homogeneous datasets. The development and comparison of networks able to generalize across different morphologies, ranging from galaxies to gravitationally-lensed transients, and across observational conditions remain an open challenge. This work proposes DeepRed, a deep learning pipeline that demonstrates how modern computer vision architectures, including ResNet, EfficientNet, Swin Transformer, and MLP-Mixer, can estimate redshifts from images of galaxies, gravitational lenses, and gravitationally-lensed supernovae. We compare these architectures and their ensemble to both neural networks (A1, A3, NetZ, and PhotoZ) and a feature-based method (HOG+SVR) on simulated (DeepGraviLens) and real (KiDS, SDSS) datasets. Our approach achieves state-of-the-art results on all datasets. On DeepGraviLens, DeepRed achieves a significant improvement in the Normalized Mean Absolute Deviation (NMAD) compared to the best baseline (PhotoZ): 55% on DES-deep (using EfficientNet), 51% on DES-wide (Ensemble), 52% on DESI-DOT (Ensemble), and 46% on LSST-wide (Ensemble). On real observations from the KiDS survey, the pipeline outperforms the best baseline (NetZ), improving NMAD by 16% on a general test set without high-probability lenses (Ensemble) and by 27% on high-probability lenses (Ensemble). For non-lensed galaxies in the SDSS dataset, the MLP-Mixer architecture achieves a 5% improvement over the best baselines (A3 and NetZ). SHAP shows that the models correctly focus on the objects of interest, with over 95% localization accuracy on high-quality images, validating the reliability of the predictions. These findings suggest that deep learning is a scalable, robust, and interpretable solution for redshift estimation in large-scale surveys.
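The percentage improvements quoted above are relative reductions of a scatter metric with respect to a baseline. The sketch below shows how such numbers could be computed. The paper's exact metric definitions are not given in this summary, so the code assumes conventions widely used in the photometric-redshift literature: residuals normalized by $1 + z$, NMAD computed as 1.4826 times the median absolute deviation of those residuals, and $\sigma_{68}$ taken as half the width of the 16th to 84th percentile interval. All function names are hypothetical.

```python
import numpy as np


def normalized_residuals(z_pred, z_true):
    """Residuals scaled by (1 + z_true), a common photo-z convention (assumed)."""
    z_pred = np.asarray(z_pred, dtype=float)
    z_true = np.asarray(z_true, dtype=float)
    return (z_pred - z_true) / (1.0 + z_true)


def nmad(z_pred, z_true):
    """1.4826 * median(|dz - median(dz)|); assumed definition of NMAD."""
    dz = normalized_residuals(z_pred, z_true)
    return 1.4826 * np.median(np.abs(dz - np.median(dz)))


def sigma_68(z_pred, z_true):
    """Half the width of the 16th-84th percentile interval of dz (assumed)."""
    dz = normalized_residuals(z_pred, z_true)
    p16, p84 = np.percentile(dz, [16.0, 84.0])
    return 0.5 * (p84 - p16)


def relative_improvement(baseline_value, model_value):
    """Percentage reduction of a lower-is-better metric versus a baseline."""
    return 100.0 * (baseline_value - model_value) / baseline_value


# Usage on synthetic data (illustrative values, not results from the paper).
rng = np.random.default_rng(1)
z_true = rng.uniform(0.0, 1.0, size=2000)
z_baseline = z_true + rng.normal(0.0, 0.04, size=z_true.size) * (1.0 + z_true)
z_model = z_true + rng.normal(0.0, 0.02, size=z_true.size) * (1.0 + z_true)

nmad_baseline = nmad(z_baseline, z_true)
nmad_model = nmad(z_model, z_true)
print("NMAD improvement (%):", relative_improvement(nmad_baseline, nmad_model))
print("sigma_68 of model:", sigma_68(z_model, z_true))
```

For Gaussian residuals, both NMAD and $\sigma_{68}$ under these assumed definitions approximate the standard deviation, while being less sensitive to outliers than a plain root-mean-square error.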

Paper Structure (34 sections, 10 equations, 20 figures, 28 tables)

Introduction
Related work
Positioning with respect to the state of the art
Datasets and methods
Datasets
DeepGraviLens
KiDS
SDSS
Tasks, targets and outputs
Loss functions and metrics
Uncertainty estimation
Pipeline
HOG + SVR
ResNet
EfficientNet
...and 19 more sections

Figures (20)

Figure 1: Examples of Einstein ring images in the four DGL datasets. DES-deep, DESI-DOT and DES-wide have a side of $\approx 12$ arcsec, and LSST-wide has a side of $\approx 9$ arcsec.
Figure 2: Example of high-probability lens in the KiDS dataset. Each side corresponds to $\approx 9$ arcsec.
Figure 3: Example of an image taken from SDSS. Each side corresponds to $\approx 18$ arcsec.
Figure 4: GT redshift distributions for DGL, KiDS and SDSS.
Figure 5: The DeepRed pipeline
...and 15 more figures
