Accurate spectroscopic redshift estimation using non-negative matrix factorization: application to MUSE spectra

Masten Bourahma, Nicolas F. Bouché, Roland Bacon, Johan Richard, Tanya Urrutia, Afonso Vale, Martin Wendt, T. T. Thai

Abstract

Accurate and automated galaxy redshift determination is essential for maximizing the scientific return of spectroscopic surveys. In this paper, we propose a data-driven method to address this challenge. The method first learns a rest-frame representation of galaxy spectra using Non-negative Matrix Factorization (NMF). The method then reconstructs new spectra using this representation at different trial redshifts, and identifies the correct redshift by selecting the one that minimizes the reconstruction error. We apply our method to galaxy spectra from the Multi Unit Spectroscopic Explorer (MUSE), covering redshifts from 0 to 6.7. Our method achieves an overall success rate of 93.7%. We further demonstrate two applications: (i) the separation between true and false sources, and (ii) the detection of blended sources from one-dimensional spectra. Our results demonstrate that NMF-based representations provide a powerful and physically motivated framework for redshift estimation in current and future large spectroscopic surveys.
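The trial-redshift search described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the basis here is a hand-made two-component toy (a flat continuum plus two Gaussian lines) rather than a learned NMF basis, the coefficients are fitted with standard multiplicative non-negative updates, and all function names, the wavelength grids, and the redshift grid are assumptions made for the example.

```python
import numpy as np

def shift_basis(rest_wave, basis, obs_wave, z):
    """Resample rest-frame basis vectors (n_comp, n_rest) onto the observed
    wavelength grid for a trial redshift z. Returns an (n_pix, n_comp) matrix."""
    rest_of_obs = obs_wave / (1.0 + z)
    cols = [np.interp(rest_of_obs, rest_wave, b, left=0.0, right=0.0) for b in basis]
    return np.stack(cols, axis=1)

def nonneg_fit(B, flux, ivar, n_iter=100, eps=1e-12):
    """Non-negative coefficients for a fixed basis B, using multiplicative
    updates (flux is assumed non-negative; negative terms are clipped)."""
    Bw = B * ivar[:, None]
    num = np.clip(Bw.T @ flux, 0.0, None)
    gram = Bw.T @ B
    h = np.ones(B.shape[1])
    for _ in range(n_iter):
        h *= num / (gram @ h + eps)
    return h

def chi2_curve(rest_wave, basis, obs_wave, flux, ivar, z_grid):
    """Weighted reconstruction error of the best non-negative fit at each trial z."""
    chi2 = np.empty(len(z_grid))
    for i, z in enumerate(z_grid):
        B = shift_basis(rest_wave, basis, obs_wave, z)
        h = nonneg_fit(B, flux, ivar)
        chi2[i] = np.sum(ivar * (flux - B @ h) ** 2)
    return chi2

# --- toy example (all values below are illustrative assumptions) ---
rest_wave = np.arange(1500.0, 9500.0, 1.0)      # rest-frame grid in Angstrom

def gauss(center, sigma=3.0):
    return np.exp(-0.5 * ((rest_wave - center) / sigma) ** 2)

basis = np.vstack([
    np.ones_like(rest_wave),                    # flat continuum component
    gauss(3727.0) + 0.8 * gauss(5007.0),        # emission-line component
])

obs_wave = np.arange(4750.0, 9350.0, 1.25)      # assumed observed grid
z_true = 0.7649
flux = shift_basis(rest_wave, basis, obs_wave, z_true) @ np.array([1.0, 5.0])
ivar = np.ones_like(flux)                       # uniform weights, noise-free toy

z_grid = np.linspace(0.0, 1.5, 1501)
chi2 = chi2_curve(rest_wave, basis, obs_wave, flux, ivar, z_grid)
z_best = z_grid[np.argmin(chi2)]
print(f"best trial redshift: {z_best:.3f}")
```

In this toy case the line component carries two lines with a fixed ratio, so a trial redshift that aligns only one of them leaves the other unexplained and yields a larger reconstruction error. Real spectra would additionally need noise-aware weights and a basis learned from data.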

Paper Structure (17 sections, 4 equations, 13 figures, 2 tables)

Figures (13)

  • Figure 1: Main statistical properties of the selected MUSE galaxy sample. Distributions of (a) redshift, (b) redshift confidence score (ZCONF), (c) stellar continuum SNR ($\rm SNR_{cont}$), and (d) collective SNR of the emission and absorption lines ($\rm SNR_{lines}$).
  • Figure 2: MUSE galaxy spectra matrix in the rest frame. This matrix shows selected MUSE galaxy spectra, prepared for NMF decomposition. Spectra are sorted by increasing redshift (from bottom to top) and transformed to their rest frame. Key redshifts are shown on the y-axis, and important spectral lines are indicated on top of the figure. The color of each pixel encodes flux density, scaled using a 95% z-scale to enhance the visibility of emission lines. Bluer colors represent higher flux densities; white pixels denote missing or unobserved data.
  • Figure 3: Illustration of redshift prediction with NMF basis vectors. (a) The rest-frame spectrum of the UDF10-4 source at redshift 0.7649 is shown in blue. This galaxy exhibits a stellar continuum and strong [O ii], H$\beta$, and [O iii] emission lines (their locations are indicated at the top of the figure); the best NMF reconstruction is plotted in black. (b) The corresponding $\chi^2$ curve. Its minimum falls at the true redshift (vertical red dashed line), so the correct redshift is successfully predicted. The second minimum in the $\chi^2$ curve at $z\sim4.4$ corresponds to a solution in which [O ii] is mistaken for Ly$\alpha$. The values of the $\Delta \chi^2$ and $R$ metrics are also reported in (b); they indicate a significant minimum and a very robust redshift prediction.
  • Figure 4: NMF Rank selection. The top and bottom panels show the mean GF and mean MAE with outlier rejection across the test folds, for ranks between 6 and 13. Corresponding standard deviations are shown as error bars. Both metrics are reported for spectra with ZCONF values of 2 and 3.
  • Figure 5: NMF learned basis vectors. Panel (a) shows the basis vectors obtained from a sequential rank-10 NMF decomposition applied to 80% of the sample. The index $n$ labels each basis vector. The names and rest-frame wavelengths of prominent spectral emission and absorption lines are indicated in the upper panels. Panel (b) presents zoomed-in views of the [O ii] and Ly$\alpha$ emission lines, shown in the first and second columns, respectively. In each zoom-in, the line's peak flux is normalized to one, the line's rest wavelength is marked by a vertical dashed gray line, and the corresponding basis vector is indicated.
  • ...and 8 more figures