The SDSS Imaging Pipelines

Robert Lupton, James E. Gunn, Zeljko Ivezic, Gillian R. Knapp, Stephen Kent, Naoki Yasuda

TL;DR

This paper outlines the SDSS imaging and data-processing software ecosystem and the algorithmic approaches used to extract scientific products from large imaging surveys. It presents the software architecture, including configuration management with CVS and ups, a TCL-based command interface, and a suite of imaging pipelines. It details two key algorithms: Karhunen-Loève (KL) based modelling of the spatially varying PSF, and a model-fitting framework for star/galaxy separation using PSF-convolved galaxy profiles. The discussion includes practical performance notes and a critical reflection on software project management in astronomy, highlighting sociotechnical challenges and recommendations. Overall, the SDSS serves as a case study in integrating software engineering with astronomical data analysis to enable large-scale surveys and reliable morphological classification.

Abstract

We summarise the properties of the Sloan Digital Sky Survey (SDSS) project, discuss our software infrastructure, and outline the architecture of the SDSS image processing pipelines. We then discuss two of the algorithms used in the SDSS image processing: the KL-transform based modelling of the spatial variation of the PSF, and the use of galaxy models in star/galaxy separation. We conclude with the first author's personal opinions on the challenges that the astronomical community faces with major software projects.

Paper Structure

This paper contains 12 sections, 3 equations, 4 figures.

Figures (4)

  • Figure 1: A trace of the memory used while processing 121 fields (3.6Gb) of an SDSS imaging run on a single 800MHz alpha processor. A total of 165029 objects were detected and characterised in 5 bands, giving a rate of 13.4ms/object/band for processing from raw CCD frame to reduced catalog. The figure has three lines illustrating memory usage versus time. The lower line is the memory actively in use; the middle dotted line shows the memory in heap, and the top line shows the memory allocated from the system. The difference between the upper two lines is guaranteed to be in 10Mb blocks, all but one of which are completely unused and can safely be assumed to be swapped out to disk.
  • Figure 2: A $g'$ vs. $g'-r'$ colour-magnitude diagram containing 31803 objects from SDSS commissioning data. The bottom two panels show all objects, the top left shows only stars and the top right only galaxies. The disk and halo turnoffs are clearly seen in the stellar diagram. If you are viewing this figure in colour, green points are stars; red points are galaxies classified morphologically as having de Vaucouleurs-like profiles; cyan points have exponential profiles; and magenta points are unclassified galaxies. In black and white, the bottom two panels are unfortunately indistinguishable.
  • Figure 3: Star-galaxy separation in the SDSS. The bottom panel shows objects that are classified as stars based on their HST morphology; the top panel shows galaxies. The x-axis is the $r'$ model magnitude. The solid line shows the number of objects classified correctly by the SDSS pipeline; the (red) dotted line shows the number misclassified. It is clear that the performance is quite good, even close to the plate limit at about 22nd.
  • Figure 4: The relationship between colour and morphological classification, the latter based on the ratio of the de Vaucouleurs and exponential likelihoods. The x-axis is the $u'-r'$ colour, which divides the galaxies nicely into two classes, presumably early- and late-type (Strateva et al. 2001). The y-axis shows the likelihood ratio (mapped into the range $[0,1]$); above and below the plot are shown the marginal distributions of galaxies which lie outside the pair of dotted lines. The correlation of colour with morphology is clearly seen. Data is from a few square degrees of run 745.
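
The throughput figures quoted in the Figure 1 caption can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope calculation, not part of the SDSS pipelines: it assumes that the quoted 13.4ms/object/band is an average over the whole 121-field run, and the function and variable names are hypothetical, chosen for illustration.

```python
# Back-of-envelope check of the Figure 1 throughput numbers.
# Assumption (not stated in the caption): the 13.4 ms/object/band rate is an
# average over the entire run, so it can be scaled up to a total run time.

N_OBJECTS = 165_029          # objects detected and characterised (caption)
N_BANDS = 5                  # bands processed per object (caption)
N_FIELDS = 121               # fields in the run (caption)
MS_PER_OBJECT_BAND = 13.4    # quoted processing rate (caption)


def implied_run_time_seconds(n_objects, n_bands, ms_per_object_band):
    """Total processing time implied by a per-object, per-band rate."""
    return n_objects * n_bands * ms_per_object_band / 1000.0


total_s = implied_run_time_seconds(N_OBJECTS, N_BANDS, MS_PER_OBJECT_BAND)
seconds_per_field = total_s / N_FIELDS
objects_per_field = N_OBJECTS / N_FIELDS

print(f"implied total time : {total_s:.0f} s ({total_s / 3600:.2f} h)")
print(f"time per field     : {seconds_per_field:.1f} s")
print(f"objects per field  : {objects_per_field:.0f}")
```

Under that assumption the run takes roughly 11,000 s (about three hours) on the single processor, or about a minute and a half per field.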