Tools and Advancements towards Data Standardization of the MAGIC Collaboration

C. P. Walther, C. Nigro, D. Elsässer, W. Rhode

TL;DR

This work addresses the challenge of long-term, interoperable data stewardship in very-high-energy gamma-ray astronomy by implementing a GADF-aligned DL3 data format for MAGIC data and introducing two open-source tools: magic_dl3, which converts proprietary MAGIC data into standardized DL3 with a Gamma-ray Instrument Response Function, and autoMAGIC, a database-driven automation framework that produces reproducible, scalable analysis configurations. The authors validate these tools against the established MAGIC MARS pipeline and Gammapy using Crab Nebula and Mrk421 datasets, demonstrating good agreement in counts, IRFs, spectra, and light curves, even under varying observing conditions such as moonlight. The results illustrate that standardized DL3 data can be analyzed with existing tools while preserving traceability and reproducibility, supporting legacy data use and future open observatory workflows. Overall, the work provides a practical path toward large-scale standardized data production and long-term data preservation in VHE gamma-ray astronomy, with broad implications for interoperability and FAIR data principles. The introduced automation reduces human error and accelerates data production, enabling more efficient data reuse and multi-messenger collaborations.

Abstract

Gamma-ray astronomy produces large data volumes from which astronomers draw scientific conclusions. Ensuring that these data remain accessible and usable beyond the lifetime of currently running experiments requires a standardized data format. Following the standardized data format proposed by the gamma-ray astronomy community, we present 104 h of the first production of 166 h of data from the MAGIC Imaging Air Cherenkov Telescopes in this format. Six datasets were processed, of which three are presented, all of which have been analyzed and validated through comparison using the open-source software Gammapy and the MAGIC analysis software MARS. Furthermore, looking towards a large-scale production of standardized data and a legacy of the data taken by the MAGIC experiment, we have developed and implemented autoMAGIC, an automated, database-driven MAGIC data reduction tool that offers a reliable and reproducible way to produce high-level datasets. By automating the choice of configuration parameters, the software reduces human error and accelerates the production of standardized data. We also show that data processed with manual and automatic methods yield comparable results.
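The validation described above rests on comparing binned quantities from two pipelines. A minimal sketch of such a bin-wise comparison follows. The function name `bin_ratio` and the example counts are hypothetical and do not come from the paper or from either software package. The uncertainty assumes independent Poisson counts, which likely overestimates the error when both pipelines process the same events.

```python
import math


def bin_ratio(counts_a, counts_b):
    """Return a list of (ratio, uncertainty) for counts_a / counts_b per bin.

    Bins where either histogram is empty yield (None, None), because the
    ratio or its relative error is undefined there.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("histograms must share the same binning")
    result = []
    for a, b in zip(counts_a, counts_b):
        if a <= 0 or b <= 0:
            result.append((None, None))
            continue
        ratio = a / b
        # Assumption: independent Poisson errors sqrt(a) and sqrt(b),
        # propagated to the ratio in quadrature.
        err = ratio * math.sqrt(1.0 / a + 1.0 / b)
        result.append((ratio, err))
    return result


# Hypothetical counts per energy bin from two pipelines.
pipeline_a = [100, 0, 50]
pipeline_b = [100, 10, 25]
ratios = bin_ratio(pipeline_a, pipeline_b)
```

A ratio compatible with 1 within its uncertainty in every populated bin would indicate agreement between the two methods under these assumptions.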

Paper Structure

This paper contains 5 sections, 1 equation, 7 figures.

Figures (7)

  • Figure 1: A structural flow chart of the MAGIC data levels illustrating the visualized data in each step of the data analysis. Blue boxes describe the levels of data analysis, with magic_dl3 highlighted as the tool that produces standardized data following the GADF guidelines. Below the blue boxes, bold black terms name the MAGIC subprogram executables run with MARS. White boxes show the configuration input given to autoMAGIC to specify the analysis characteristics.
  • Figure 2: Description of the autoMAGIC operation procedure. Data that characterize the analysis are inserted into the database, which stores the respective information, by requesting them from the PIC cluster. The PIC cluster draws the information from the PIC File System and returns it to the database. Jobs are created and inserted into the database based on the data available in the PIC File System and on whether jobs mirroring the analysis configuration of the requested jobs already exist in the database. Jobs are submitted based on their status in the database and on the workload of the PIC cluster.
  • Figure 3: Comparison of the DL3 components as outlined in Section 2. The leftmost plot depicts the histogram of counts in the on-region, and the second plot from the left depicts the histogram of counts in the off-region. The rightmost plot depicts the bias and the resolution of the energy dispersion, while the second plot from the right depicts the effective area. All plots include a ratio showing the deviation between the two methods.
  • Figure 4: Obtained SEDs of the Crab Nebula, with the left-side plot showing the SED for a $0.4^\circ$ wobble offset. The six subplots on the right side depict the SED as a function of the wobble offset, ranging from $0.2^\circ$ in the top left to $1.4^\circ$ in the bottom right. In all plots, red denotes the novel pipeline using magic_dl3 and Gammapy, black the proprietary MARS approach, and blue the reference value.
  • Figure 5: The light curve over the whole $42\,\mathrm{h}$ data set. The run-wise light curve is shown as transparent points and the weekly light curve as solid points. The blue dots represent the reference from MAGIC.
  • ...and 2 more figures
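
The job handling described in the Figure 2 caption (create a job only if no job with the same analysis configuration exists, then submit according to job status and cluster workload) can be sketched with an in-memory SQLite database. This is an illustration only, not autoMAGIC's actual schema or code: the table layout and the names `create_schema`, `request_jobs`, `submit_pending`, and `free_slots` are hypothetical.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical jobs table: one row per (run, analysis configuration).
    conn.execute(
        """CREATE TABLE jobs (
               id INTEGER PRIMARY KEY,
               run_id TEXT NOT NULL,
               config TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending',
               UNIQUE (run_id, config)
           )"""
    )


def request_jobs(conn, available_runs, config):
    """Insert one job per available run unless an identical job exists.

    Returns the number of newly created jobs.
    """
    created = 0
    for run_id in available_runs:
        existing = conn.execute(
            "SELECT 1 FROM jobs WHERE run_id = ? AND config = ?",
            (run_id, config),
        ).fetchone()
        if existing is None:
            conn.execute(
                "INSERT INTO jobs (run_id, config) VALUES (?, ?)",
                (run_id, config),
            )
            created += 1
    conn.commit()
    return created


def submit_pending(conn, free_slots):
    """Mark up to free_slots pending jobs as submitted, lowest id first.

    free_slots stands in for the cluster workload (an assumption).
    """
    rows = conn.execute(
        "SELECT id FROM jobs WHERE status = 'pending' ORDER BY id LIMIT ?",
        (max(free_slots, 0),),
    ).fetchall()
    ids = [row[0] for row in rows]
    conn.executemany(
        "UPDATE jobs SET status = 'submitted' WHERE id = ?",
        [(job_id,) for job_id in ids],
    )
    conn.commit()
    return ids


conn = sqlite3.connect(":memory:")
create_schema(conn)
first = request_jobs(conn, ["run1", "run2", "run3"], "configA")
second = request_jobs(conn, ["run2", "run3", "run4"], "configA")
submitted = submit_pending(conn, free_slots=2)
```

Requesting the same runs with the same configuration a second time creates no duplicate jobs, which is one way a database can keep repeated requests reproducible.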