
Spectral Classification and Redshift Measurement for the SDSS-III Baryon Oscillation Spectroscopic Survey

Adam S. Bolton, David J. Schlegel, Eric Aubourg, Stephen Bailey, Vaishali Bhardwaj, Joel R. Brownstein, Scott Burles, Yan-Mei Chen, Kyle Dawson, Daniel J. Eisenstein, James E. Gunn, G. R. Knapp, Craig P. Loomis, Robert H. Lupton, Claudia Maraston, Demitri Muna, Adam D. Myers, Matthew D. Olmstead, Nikhil Padmanabhan, Isabelle Paris, Will J. Percival, Patrick Petitjean, Constance M. Rockosi, Nicholas P. Ross, Donald P. Schneider, Yiping Shu, Michael A. Strauss, Daniel Thomas, Christy A. Tremonti, David A. Wake, Benjamin A. Weaver, W. Michael Wood-Vasey

TL;DR

The paper presents the DR9 implementation of the idlspec2d pipeline for SDSS-III BOSS, detailing a $\chi^2$-minimization approach that fits spectra to expanded galaxy, quasar, and star template bases to produce automated redshift, classification, and parameter measurements. It introduces new, high-S/N galaxy, quasar, and star templates built from stacking, PCA, and spectral libraries, including velocity-dispersion and emission-line analyses; it also implements galaxy priors (Z_NOQSO/CLASS_NOQSO) to improve CMASS/LOWZ redshift success. Key results show high automated completeness for CMASS and LOWZ galaxies (≥95% in the galaxy class) with impurity rates around 0.2%, competitive galaxy redshift precision (a few tens of km s$^{-1}$), and substantial, but not perfect, quasar redshift performance (with ~79% ZWARNING=0 and ~51.5% confirmed quasars in the Ly$\alpha$-forest regime). The paper demonstrates the pipeline’s effectiveness for large-scale structure and Ly$\alpha$ forest studies, provides detailed data products and templates, and outlines known issues and directions for future work, including improvements to extraction, template space, and handling of special object classes.

Abstract

(abridged) We describe the automated spectral classification, redshift determination, and parameter measurement pipeline in use for the Baryon Oscillation Spectroscopic Survey (BOSS) of the Sloan Digital Sky Survey III (SDSS-III) as of Data Release 9, encompassing 831,000 moderate-resolution optical spectra. We give a review of the algorithms employed, and describe the changes to the pipeline that have been implemented for BOSS relative to previous SDSS-I/II versions, including new sets of stellar, galaxy, and quasar redshift templates. For the color-selected CMASS sample of massive galaxies at redshift $0.4 \lesssim z \lesssim 0.8$ targeted by BOSS for the purposes of large-scale cosmological measurements, the pipeline achieves an automated classification success rate of 98.7% and confirms 95.4% of unique CMASS targets as galaxies (with the balance being mostly M stars). Based on visual inspections of a subset of BOSS galaxies, we find that ~0.2% of confidently reported CMASS sample classifications and redshifts are incorrect, and ~0.4% of all CMASS spectra are objects unclassified by the current algorithm which are potentially recoverable. The BOSS pipeline confirms that ~51.5% of the quasar targets have quasar spectra, with the balance mainly consisting of stars. Statistical (as opposed to systematic) redshift errors propagated from photon noise are typically a few tens of km s$^{-1}$ for both galaxies and quasars, with a significant tail to a few hundreds of km s$^{-1}$ for quasars. We test the accuracy of these statistical redshift error estimates using repeat observations, finding them underestimated by a factor of 1.19 to 1.34 for galaxies, and by a factor of 2 for quasars. We assess the impact of sky-subtraction quality, S/N, and other factors on galaxy redshift success. Finally, we document known issues, and describe directions of ongoing development.
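The repeat-observation test mentioned in the abstract can be illustrated with a minimal sketch. It is not the pipeline's code: the data are synthetic, the function names and the underestimation factor of 1.25 are hypothetical, and a robust dispersion estimate (median absolute deviation) stands in for the Gaussian fit to the histogram that the paper describes. The idea is that if reported errors were accurate, redshift differences between two epochs divided by the quadrature sum of their errors would have unit dispersion; a larger dispersion measures the factor by which the errors are underestimated.

```python
import numpy as np


def scaled_redshift_differences(z1, z2, err1, err2):
    """Redshift differences between two epochs, in units of the
    quadrature sum of the reported 1-sigma errors."""
    z1, z2, err1, err2 = map(np.asarray, (z1, z2, err1, err2))
    return (z1 - z2) / np.hypot(err1, err2)


def robust_sigma(x):
    """Gaussian-equivalent dispersion from the median absolute deviation
    (used here in place of a Gaussian fit to the histogram)."""
    x = np.asarray(x)
    return 1.4826 * np.median(np.abs(x - np.median(x)))


# Synthetic demonstration (all numbers below are assumptions).
rng = np.random.default_rng(0)
n = 20000
true_factor = 1.25                    # hypothetical underestimation factor
reported_err = np.full(n, 1e-4)       # reported 1-sigma redshift errors
z_true = rng.uniform(0.4, 0.8, n)

# The actual scatter is larger than reported by `true_factor`.
z_epoch1 = z_true + rng.normal(0.0, true_factor * reported_err)
z_epoch2 = z_true + rng.normal(0.0, true_factor * reported_err)

scaled = scaled_redshift_differences(z_epoch1, z_epoch2,
                                     reported_err, reported_err)
recovered_factor = robust_sigma(scaled)
print(f"recovered underestimation factor: {recovered_factor:.2f}")
```

Multiplying the reported errors by the recovered factor would bring the scaled differences back to unit dispersion in this toy setup.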

Paper Structure

This paper contains 22 sections, 7 equations, 15 figures, 6 tables.

Figures (15)

  • Figure 1: Mosaic of representative BOSS spectra, with a resolution of $R\approx 2000$. Black lines show data (smoothed over a 5-pixel window), cyan lines show best-fit redshift/classification model, and red lines show 1-$\sigma$ noise level estimated by the extraction pipeline. Spectra are labeled by PLATE-MJD-FIBERID. Individual objects are: (a) redshift $z = 0.256$ LOWZ galaxy; (b) redshift $z = 0.649$ CMASS galaxy; (c) redshift $z = 0.669$ CMASS galaxy with post-starburst continuum; (d) redshift $z = 0.217$ starburst galaxy (from QSO target sample); (e) redshift $z = 2.873$ quasar; (f) redshift $z = 0.661$ quasar; (g) spectrophotometric standard star; (h) M star (from CMASS target sample).
  • Figure 2: Schematic illustration of the idlspec2d redshift measurement algorithm. The reduced $\chi^2$ curve as a function of redshift (black curve) is determined from the best-fit linear combination of template basis spectra at each trial redshift value. The best redshift is defined by the location of the global minimum (green). Subsidiary minima separated by less than 1000 km s$^{-1}$ are not considered to be separate (pink). The curvature of a parabolic fit to the $\chi_r^2$ curve at the global minimum (magenta) is used to determine the best-fit redshift error estimate. The second-best redshift fit is determined by the location of the second-lowest well-separated $\chi_r^2$ minimum (blue). The difference $\Delta \chi_r^2$ (red) between best and second-best redshifts is used to assign confidence in the measurement, as described in the text.
  • Figure 3: BOSS redshift and classification template basis sets for galaxies (top), quasars (middle), and CV stars (bottom).
  • Figure 4: Redshift distribution of 1000 targeted (gray) and 571 observed (black) quasar training spectra. Spectra from the observed distribution are used to construct the PCA-based quasar redshift templates used for automated classification and redshift measurement in BOSS DR9 and shown in the middle panel of Figure 3.
  • Figure 5: Histograms of redshift differences of LOWZ (left) and CMASS (right) galaxies that are observed more than once, scaled by the quadrature sum of statistical error estimates in each epoch. Over-plotted are the best-fit Gaussian models, with a dispersion parameter of $\sigma=1.34$ for the LOWZ sample and $\sigma=1.19$ for the CMASS sample.
  • ...and 10 more figures
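
The redshift measurement scheme described in the Figure 2 caption can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the idlspec2d implementation: the function names, the two-component template basis, the emission-line list, the noise level, and the trial redshift grid are all hypothetical, and the error estimate uses the $\Delta\chi^2 = 1$ convention on the unreduced $\chi^2$ rather than whatever normalization the pipeline applies.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def chi2_scan(wave, flux, ivar, tmpl_wave, tmpl_basis, z_grid):
    """chi^2 of the best-fit linear combination of templates per trial z."""
    w = np.sqrt(ivar)
    chi2 = np.empty(len(z_grid))
    for i, z in enumerate(z_grid):
        # Redshift the rest-frame templates and resample onto the data grid.
        A = np.column_stack([np.interp(wave, tmpl_wave * (1.0 + z), t)
                             for t in tmpl_basis])
        coeffs, *_ = np.linalg.lstsq(A * w[:, None], flux * w, rcond=None)
        chi2[i] = np.sum(ivar * (flux - A @ coeffs) ** 2)
    return chi2


def best_and_second(z_grid, chi2, min_sep_kms=1000.0):
    """Best redshift, parabolic error, second-best redshift, delta chi^2."""
    i0 = int(np.clip(np.argmin(chi2), 1, len(chi2) - 2))
    # Parabola through the three grid points around the global minimum,
    # fitted in a centered coordinate for numerical stability.
    x = z_grid[i0 - 1:i0 + 2] - z_grid[i0]
    a, b, _ = np.polyfit(x, chi2[i0 - 1:i0 + 2], 2)
    z_best = z_grid[i0] - b / (2.0 * a)
    z_err = 1.0 / np.sqrt(a)  # offset at which chi^2 rises by 1
    # Local minima farther than min_sep_kms from the best redshift.
    idx = np.arange(1, len(chi2) - 1)
    is_min = (chi2[idx] < chi2[idx - 1]) & (chi2[idx] <= chi2[idx + 1])
    cand = idx[is_min]
    dv = C_KMS * np.abs(z_grid[cand] - z_best) / (1.0 + z_best)
    cand = cand[dv > min_sep_kms]
    if cand.size == 0:
        return z_best, z_err, np.nan, np.nan
    j = cand[np.argmin(chi2[cand])]
    return z_best, z_err, z_grid[j], chi2[j] - chi2.min()


# Synthetic example: flat continuum plus emission lines (assumed values).
rng = np.random.default_rng(1)
tmpl_wave = np.linspace(1500.0, 9000.0, 7501)          # rest frame, Angstrom
lines = [3727.0, 3934.0, 3969.0, 4861.0, 5007.0]
line_tmpl = sum(np.exp(-0.5 * ((tmpl_wave - l0) / 5.0) ** 2) for l0 in lines)
tmpl_basis = [np.ones_like(tmpl_wave), line_tmpl]

z_true, noise = 0.55, 0.2
wave = np.linspace(3600.0, 10000.0, 2000)              # observed frame
model = 1.0 + np.interp(wave, tmpl_wave * (1.0 + z_true), line_tmpl)
flux = model + rng.normal(0.0, noise, wave.size)
ivar = np.full(wave.size, 1.0 / noise**2)

z_grid = np.arange(0.3, 0.8, 1e-4)
chi2 = chi2_scan(wave, flux, ivar, tmpl_wave, tmpl_basis, z_grid)
z_best, z_err, z_second, delta_chi2 = best_and_second(z_grid, chi2)
print(f"z_best = {z_best:.5f} +/- {z_err:.5f}, second-best z = {z_second:.4f}")
```

A small difference in $\chi^2$ between the best and second-best minima would signal an ambiguous redshift; how that difference is thresholded into a warning flag is described in the paper's text and is not reproduced here.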