Common Foundations for SHACL, ShEx, and PG-Schema

S. Ahmetaj, I. Boneva, J. Hidders, K. Hose, M. Jakubowski, J. E. Labra-Gayo, W. Martens, F. Mogavero, F. Murlak, C. Okulmus, A. Polleres, O. Savkovic, M. Simkus, D. Tomaszuk

TL;DR

This paper develops a unified framework for comparing SHACL, ShEx, and PG-Schema. It introduces a Common Graph Data Model into which both RDF graphs and property graphs can be embedded, formalizes the non-recursive core components of each language over this model, and defines a Common Graph Schema Language (CoGSL) that captures the functionality the three languages share. It also discusses translations between the formalisms and includes an extensive survey of related work. The practical impact is a principled basis for interoperable graph validation and schema design across heterogeneous graph data models. Future directions include recursion in ShEx and richer PG-Schema capabilities.

Abstract

Graphs have emerged as an important foundation for a variety of applications, including capturing and reasoning over factual knowledge, semantic data integration, social networks, and providing factual knowledge for machine learning algorithms. To formalise certain properties of the data and to ensure data quality, there is a need to describe the schema of such graphs. Because of the breadth of applications and availability of different data models, such as RDF and property graphs, both the Semantic Web and the database community have independently developed graph schema languages: SHACL, ShEx, and PG-Schema. Each language has its unique approach to defining constraints and validating graph data, leaving potential users in the dark about their commonalities and differences. In this paper, we provide formal, concise definitions of the core components of each of these schema languages. We employ a uniform framework to facilitate a comprehensive comparison between the languages and identify a common set of functionalities, shedding light on both overlapping and distinctive features of the three languages.

Paper Structure

This paper contains 64 sections, 13 theorems, 52 equations, 4 figures, and 11 tables.

Key Result

Proposition 1

For every common schema there exist equivalent SHACL and ShEx schemas.

Figures (4)

  • Figure 1: The media service common graph.
  • Figure 2: Two graphs indistinguishable by ShEx.
  • Figure 3: A standard ShEx schema.
  • Figure 4: Abstract syntax for s-ShEx.

Theorems & Definitions (32)

  • Definition 1
  • Definition 2: Neighbourhood
  • Definition 3: Path Expression
  • Definition 4: SHACL Shape
  • Definition 5: SHACL Selector
  • Definition 6: Shapes and triple expressions
  • Definition 7: ShEx Selectors
  • Definition 8: Content type
  • Definition 9: PG-path expressions
  • Definition 10: PG-Shapes
  • ...and 22 more