The Gravitational-wave Optical Transient Observer (GOTO) data pipeline and workflow for transient discovery

J. D. Lyman; D. O'Neill; T. Killestein; D. Jarvis; A. Kumar; K. Ulaczyk; K. Ackley; P. Chote; M. J. Dyer; M. Pursiainen; D. Steeghs; B. Godson; M. Magee; J. R. Mullaney; B. Warwick; S. Belkin; D. K. Galloway; G. Ramsay; V. S. Dhillon; P. O'Brien; K. Noysena; R. Kotak; R. P. Breton; L. K. Nuttall; B. Gompertz; D. Pollacco; J. Casares; D. L. Coppejans; R. A. J. Eyles-Ferris; O. Graur; L. Kelsey; M. R. Kennedy; A. Levan; S. Littlefair; S. Mandhai; D. Mata Sánchez; S. Mattila; J. McCormac; S. Moran; C. Phillips; K. Pu; A. Sahu; M. Shrestha; E. Stanway; R. L. C. Starling; L. Vincetti; E. Wickens; K. Wiersema

Abstract

Wide-field and high-cadence sky surveys are the first step in the chain of discovery and characterisation of astrophysical transients such as supernovae, kilonovae, and tidal disruption events, each linked to the varied demise of stellar systems. The Gravitational-wave Optical Transient Observer (GOTO) is a telescope array of thirty-two 40 cm unit telescopes split over two almost antipodal sites. It performs a regular time-domain sky survey in the optical to ~20 mag, in addition to immediately scheduling follow-up observations at the locations of external multi-wavelength and multi-messenger triggers. To facilitate the timely recovery of optical counterparts to these triggers, as well as serendipitous discoveries of astrophysical transients in the regular sky survey, a low-latency data pipeline and workflow was developed. The implementation of this workflow is described herein, and the quality of the GOTO data it delivers is assessed alongside its performance for prompt transient recovery. Candidate discoveries are identified using difference image analysis, and this process is typically complete ~7 minutes after shutter close on the telescope. We further describe later processing of these candidates (both automated and human-in-the-loop), including reporting to the wider community and the triggering of more detailed observations, with a focus on immediate, intra-night characterisation. The workflow is meeting the needs of GOTO to promptly discover, report and characterise infant transients. Nevertheless, areas for further development and improvement are also highlighted.

Paper Structure (56 sections, 5 equations, 19 figures, 4 tables)

Introduction
GOTO Overview
Observing strategy
Hardware Implementation
Pipeline implementation
Transferring raw data from the observatories
Apache Airflow: Orchestrating kadmilos
DAG details
DAG versioning
Overall Impressions
PostgreSQL: Persistent data products storage
Basic raw data processing
Pre-reduction corrections
Generation of super calibration frames
Identifying and correcting column traps
...and 41 more sections

Figures (19)

Figure 1: A schematic showing the data flow and software implementation of the GOTO transient discovery workflow. Full details are given in the respective sections. Succinctly, data are acquired and written by the GOTO mounts to a local file system at the respective observatory (dyer_thesis). A rawtransfer service (\ref{sec:rawtransfer}) at each site monitors these local file systems, packages new FITS files as they are created, and transfers them to a large-capacity file system at the University of Warwick, writing the details of the transfer to a record in the transfer database, also held at Warwick. The insertion of the record emits a NOTIFY signal, which tells the Airflow (\ref{sec:airflow}) scheduler that a new raw insertion DAG (\ref{sec:dagdetails}) should be scheduled to process the file. The tasks of the kadmilos pipeline (e.g. \ref{sec:single_image_generation}) are performed by a cluster of Celery workers, scheduled by Airflow to follow the respective DAG's logic. As part of these tasks, the workers write processed data and results to a central file system and database. These central data resources are pulled by the GOTO marshall (\ref{sec:marshall}), where additional metadata such as contextual information is fetched and source associations are made. The resulting source data are then visualised, vetted, and reported by collaboration members via a webserver.
Figure 2: Column-wise traces of pixel value in deep-stacked single images of a GOTO camera highlighting various column-trap behaviour. Each panel shows the trace for a visually identified bad column (dark blue), alongside the trace for the neighbouring, good, column (light blue). The top three panels show 'step' features in the traces of varying depths, with the second panel also showing some offset above that of the neighbouring good column. The bottom two panels show overall offsets from the good trace along the entire column length.
Figure 3: The use of a column-wise step-model to correct column charge trap defects in GOTO data. Left: A deep stack of 500 single science images prior to the correction. Middle: The step-model of the column traps following the procedure described in the text (\ref{sec:columntraps}). Right: Subtraction of the step-model from the uncorrected data, showing the removal of almost all trap features. Very subtle and/or short traps remain, visible only because of the aggressive visualisation scaling. The read-noise alone in a given GOTO image dominates over the amplitude of these features. Axis values are pixel coordinates, and the colour scale employs a broken log-linear scheme to enhance contrast at the low background level.
Figure 4: Column-wise traces of pixel value in column-trap calibration frames showing the evolution of column-trap behaviour. Each panel shows the trace for the same columns as shown in \ref{fig:column_trap_individual_bad_columns}, with the colour indicating the date the calibration frame was created.
Figure 5: Density distributions of source HFD, calculated from medians in individual GOTO single images. Individual UT distributions are shown (thin coloured lines), along with the overall distribution per mount (thick dashed line). Density estimates were determined using a Gaussian kernel, with the bandwidth set using the method of scott_kde. The distributions were first filtered to remove the small number of images with median HFD values of $>15$ px. The more varied performance of GOTO-1 reflects the older collimation of that mount compared to the others.
...and 14 more figures
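
The density estimates described in the Figure 5 caption can be illustrated with a minimal sketch. It assumes the one-dimensional form of Scott's rule, h = sigma * n^(-1/5), which is the form used by the 'scott' option of scipy.stats.gaussian_kde; whether the paper uses exactly this form is an assumption. The function names and the sample values are hypothetical and are not GOTO data.

```python
import math
import statistics


def scott_bandwidth(samples):
    """Scott's rule for 1-D data (assumed form): h = sigma * n**(-1/5)."""
    n = len(samples)
    sigma = statistics.stdev(samples)  # sample standard deviation
    return sigma * n ** (-1.0 / 5.0)


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    h = scott_bandwidth(samples) if bandwidth is None else bandwidth
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels of width h centred on each sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return density


# Hypothetical median HFD values in pixels, one per image (illustrative only).
median_hfd = [2.9, 3.1, 3.4, 3.0, 3.8, 4.2, 3.3, 16.5, 3.6, 3.2]

# Filter as described in the caption: drop images with median HFD > 15 px.
kept = [v for v in median_hfd if v <= 15.0]

density = gaussian_kde(kept)
```

Evaluating `density` on a grid of HFD values gives a curve analogous to one of the thin per-UT lines; pooling the samples from all UTs on a mount before calling `gaussian_kde` would give the per-mount curve.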
