Large Data Acquisition and Analytics at Synchrotron Radiation Facilities

Aashish Panta, Giorgio Scorzelli, Amy A. Gooch, Werner Sun, Katherine S. Shanks, Suchismita Sarker, Devin Bougie, Keara Soloway, Rolf Verberg, Tracy Berman, Glenn Tarcea, John Allison, Michela Taufer, Valerio Pascucci

TL;DR

This work addresses the challenge of managing and making real-time sense of terabytes-to-petabytes of synchrotron data under tight beamtime constraints. It introduces a modular web-based framework anchored by the NSDF EntryPoint, and two dashboards for data acquisition and data evaluation that enable remote, real-time monitoring, quality control, and data-driven decision making. Deployed on CHESS beamlines ID3A, ID3B, and ID4B and tested with 43 research groups, the system processed 50–100 TB and over 10 million files by late 2024, demonstrating scalability, accessibility, and workflow improvements. The approach, built on OpenVisus and NSDF integration, offers a transferable solution for other facilities to enhance scientific productivity and collaboration at scale.

Abstract

Synchrotron facilities like the Cornell High Energy Synchrotron Source (CHESS) generate massive data volumes from complex beamline experiments, but face challenges such as limited access time, the need for on-site experiment monitoring, and managing terabytes of data per user group. We present the design, deployment, and evaluation of a framework that addresses CHESS's data acquisition and management issues. Deployed on a secure CHESS server, our system provides real-time, web-based tools for remote experiment monitoring and data quality assessment, improving operational efficiency. Implemented across three beamlines (ID3A, ID3B, ID4B), the framework managed 50–100 TB of data and over 10 million files in late 2024. Testing with 43 research groups and 86 dashboards showed reduced overhead, improved accessibility, and streamlined data workflows. Our paper highlights the development, deployment, and evaluation of our framework and its transformative impact on synchrotron data acquisition.

Paper Structure (22 sections, 7 figures)

Figures (7)

  • Figure 1: Our remote data acquisition and evaluation framework. Traditionally, scientists interact with the physical display of the beamline station computer to collect and monitor data. With our framework’s EntryPoints deployment, data can now be acquired, visualized, accessed, and analyzed remotely, significantly enhancing experimental flexibility and efficiency.
  • Figure 2: Our web-based framework revolutionizes synchrotron data acquisition and analysis by enabling remote monitoring, real-time data quality assessment, and improved operational efficiency, significantly enhancing scientific productivity and on-site or remote collaboration at large-scale facilities like CHESS. Our acquisition statistics dashboard features widgets that reveal previously hidden metadata to beamline user groups and enable real-time progress monitoring for on- and off-site user groups. (a) shows a timeline widget that enables zooming into specific event ranges to identify unexpected drops in file counts, highlighting interruptions in data collection, such as the beam outage at 3:30 AM on November 14, 2023. (b) shows our probe dashboard, comparing individual radiographs within a rotation series (left) and a 5-probe view (right) for an arbitrary specimen. Each probe is configured to sample the minimum and maximum within an 8x8 pixel window. The main view shows the XY-plane at $Z=250$. The graph includes insets of the slices for illustrative purposes, including $Z=250$, where the green probe reaches a local minimum, and $Z=200$, where the image is lost due to intermittent intensity fluctuations.
  • Figure 3: Example statistics dashboard: (a) file extension distribution, (b) experiment timeline, and (c) timeline of far-field (ff) and near-field (nf) scans.
  • Figure 4: The data evaluation and visualization dashboard, iteratively developed with feedback from CHESS staff scientists, for evaluating data by slice of the volume (left). Advanced features in the probe panel of the dashboard (right) enable users to examine data through the volume at key points of interest. One can dynamically change the view by moving the graph line or points in the slice view. The probes can also be changed to provide average, maximum, or minimum within a window of [1,8] pixels in either direction.
  • Figure 5: Top: plot of scan number as a function of time. Annotations have been added to aid in interpretation. Bottom: Plots using our stats dashboard of mechanical load versus tension motor position for tensile loading of specimen Mg-1Y (left) and compressive loading of specimen Mg-3Zn-0.1Ca (right).
  • ...and 2 more figures
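
The probe described in the Figure 4 caption (an average, maximum, or minimum taken within a small pixel window and followed through the volume) can be illustrated with a minimal sketch. This is not the dashboard's implementation: the function name `probe_profile`, the `volume[z][y][x]` indexing, the clipping at image borders, and the reading of "[1,8] pixels in either direction" as a half-width between 1 and 8 are all assumptions made for illustration, and the volume is synthetic.

```python
from statistics import mean


def probe_profile(volume, x, y, half_width=4, reducer=max):
    """Reduce a small window around (x, y) in every Z slice of a volume.

    Hypothetical helper, not part of any real dashboard API.
    volume is indexed as volume[z][y][x]. half_width is the number of
    pixels taken in either direction from (x, y); the window is clipped
    at the image border (an assumption, the caption does not say).
    """
    if not 1 <= half_width <= 8:
        raise ValueError("half_width must be between 1 and 8 pixels")

    profile = []
    for z_slice in volume:
        height, width = len(z_slice), len(z_slice[0])
        rows = range(max(0, y - half_width), min(height, y + half_width + 1))
        cols = range(max(0, x - half_width), min(width, x + half_width + 1))
        window = [z_slice[r][c] for r in rows for c in cols]
        profile.append(reducer(window))  # one value per slice
    return profile


# Synthetic 3-slice, 5x5 volume; real data would come from the beamline.
volume = [
    [[z * 100 + y * 5 + x for x in range(5)] for y in range(5)]
    for z in range(3)
]

center_min = probe_profile(volume, x=2, y=2, half_width=1, reducer=min)
center_max = probe_profile(volume, x=2, y=2, half_width=1, reducer=max)
center_avg = probe_profile(volume, x=2, y=2, half_width=1, reducer=mean)
```

Plotting such a profile against the slice index gives a curve of the kind the caption describes, where a dip at one slice (for example, a local minimum at a given Z) points to a slice worth inspecting. Passing the reducer as a function keeps the average, maximum, and minimum modes on one code path.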