Semantic Model for the SKA Regional Centre Network
Edgar Ribeiro João, Manuel Parra-Royón, Julián Garrido
TL;DR
The paper tackles the challenge of describing the SKA Regional Centre Network (SRCNet) for scalable data management and analysis by introducing a JSON-LD–based semantic framework. It defines a dual static-dynamic model to capture persistent architecture and time-dependent performance, data placement, and resource utilization, enabling interoperable resource orchestration and potential service brokering. The model is instantiated and validated using SPARQL queries against RDF representations derived from JSON-LD, with competency questions demonstrating the framework's expressiveness. By hosting a FAIR-compatible semantic description and demonstrating conversion to RDF for queryable storage (e.g., in Apache Jena Fuseki), the work lays the groundwork for precise mathematical modeling and real-time workflow planning across SRCNet's distributed infrastructure, ultimately supporting efficient data processing for SKA science. The approach promises practical impact in interoperability, cost-aware data movement, and scalable orchestration of science gateways and compute resources in a globally distributed astronomical data ecosystem.
Abstract
The unprecedented volume of data from the Square Kilometre Array (SKA) telescopes will require robust strategies for efficient data processing and management. In this context, the SKA Regional Centre Network (SRCNet), a collaborative global infrastructure of regional centres distributed around the world, is poised to play a critical role. The network will handle and analyse the extensive data streams generated by the telescopes, thereby enabling significant advances in astronomical research. This paper introduces a semantic model for the SRCNet, implemented with JSON-LD, that details the network's architecture, data distribution, and computing services. By explicitly defining nodes, resources, relationships, and workflows, the model lays a foundation for interoperability and efficient resource management within the distributed network. It supports two configurations, centralized and decentralized, depending on where the data reside, enabling a future service broker to plan workflows efficiently by querying nodes for real-time system availability. Consistency tests using SPARQL queries were run on the model to validate its integrity. This research thus contributes to semantic modeling in astronomy by providing a semantic model for the SRCNet, a topic that has not been previously explored. The model serves as a precursor to a precise mathematical representation of the network and establishes a foundational framework for a future service broker.
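To make the idea of querying a JSON-LD description more concrete, the following is a minimal sketch using only the Python standard library. It is not the paper's schema or tooling: the `srcnet:` prefix, the class and property names (`srcnet:Node`, `srcnet:hosts`, `srcnet:freeCores`), the node and dataset identifiers, and all values are hypothetical. The sketch flattens a small JSON-LD-like document into subject-predicate-object triples and answers one illustrative competency question with plain Python filtering, which stands in for the SPARQL query that would be issued against an RDF store.

```python
import json

# Hypothetical JSON-LD description of two SRCNet nodes. The vocabulary
# (the "srcnet:" prefix, class and property names, and all values) is
# invented for illustration and is not the paper's actual schema.
DOC = json.loads("""
{
  "@context": {"srcnet": "https://example.org/srcnet#"},
  "@graph": [
    {"@id": "srcnet:node-A", "@type": "srcnet:Node",
     "srcnet:hosts": {"@id": "srcnet:dataset-1"},
     "srcnet:freeCores": 128},
    {"@id": "srcnet:node-B", "@type": "srcnet:Node",
     "srcnet:freeCores": 16}
  ]
}
""")


def to_triples(doc):
    """Flatten the @graph into (subject, predicate, object) tuples.

    Handles only the small JSON-LD subset used above: literal values
    and {"@id": ...} references. It is not a JSON-LD processor.
    """
    triples = []
    for node in doc["@graph"]:
        subject = node["@id"]
        for key, value in node.items():
            if key == "@id":
                continue
            predicate = "rdf:type" if key == "@type" else key
            obj = value["@id"] if isinstance(value, dict) else value
            triples.append((subject, predicate, obj))
    return triples


def nodes_able_to_process(triples, dataset, min_cores):
    """Illustrative competency question: which nodes host `dataset` and
    report at least `min_cores` free cores? A SPARQL SELECT would express
    the same pattern against a triple store."""
    hosting = {s for s, p, o in triples
               if p == "srcnet:hosts" and o == dataset}
    enough = {s for s, p, o in triples
              if p == "srcnet:freeCores" and o >= min_cores}
    return sorted(hosting & enough)


TRIPLES = to_triples(DOC)
print(nodes_able_to_process(TRIPLES, "srcnet:dataset-1", 64))
```

In this toy, the hosting relation plays the role of persistent placement information and the free-core count plays the role of time-dependent availability that a service broker might consult. A real deployment would instead rely on a conforming JSON-LD processor and a SPARQL endpoint.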
