Expressiveness of SHACL Features and Extensions for Full Equality and Disjointness Tests

Bart Bogaerts; Maxime Jakubowski; Jan Van den Bussche

TL;DR

This work analyzes the expressive power of SHACL features for constraining RDF graphs, focusing on equality ($\mathit{eq}$), disjointness ($\mathit{disj}$), and closure ($\mathit{closed}$) constraints and their interaction with target-based shape schemas. The authors show that each feature is primitive: for each one they construct a boolean query that is definable with the feature but not without it, so removing any feature strictly reduces expressiveness. They also show that SHACL's restriction of the left-hand sides of inclusion constraints to targets loses no expressive power as long as closure constraints are not used. Finally, they demonstrate that adding full versions of equality or disjointness tests yields a strictly more powerful language, and they extend the results to stratified recursion. The findings clarify when SHACL's target-based restrictions are essential and how extensions could broaden SHACL's descriptive capabilities, with implications for schema design and reasoning in RDF data management.

Abstract

SHACL is a W3C-proposed schema language for expressing structural constraints on RDF graphs. Recent work on formalizing this language has revealed a striking relationship to description logics. SHACL expressions can use three fundamental features that are not so common in description logics: equality tests, disjointness tests, and closure constraints. Moreover, SHACL is peculiar in allowing only a restricted form of expressions (so-called targets) on the left-hand side of inclusion constraints. The goal of this paper is to obtain a clear picture of the impact and expressiveness of these features and restrictions. We show that each of the three features is primitive: using the feature, one can express boolean queries that are not expressible without using the feature. We also show that the restriction that SHACL imposes on allowed targets is inessential, as long as closure constraints are not used. In addition, we show that enriching SHACL with "full" versions of equality tests, or disjointness tests, results in a strictly more powerful language.
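To make the three features concrete, the following is a minimal Python sketch of how each test could be evaluated at a single node of a graph given as a set of triples. It is an illustration under simplifying assumptions, not the paper's formal semantics: both sides of the equality and disjointness tests are plain property names (the paper allows a path expression on one side), and the function names, the triple encoding, and the example data are all hypothetical.

```python
from typing import Set, Tuple

# A graph is a set of (subject, property, object) triples (assumed encoding).
Triple = Tuple[str, str, str]


def successors(graph: Set[Triple], node: str, prop: str) -> Set[str]:
    """Return every object o such that (node, prop, o) is in the graph."""
    return {o for (s, p, o) in graph if s == node and p == prop}


def satisfies_eq(graph: Set[Triple], node: str, p: str, q: str) -> bool:
    """Equality test: the p-successors and q-successors of node coincide."""
    return successors(graph, node, p) == successors(graph, node, q)


def satisfies_disj(graph: Set[Triple], node: str, p: str, q: str) -> bool:
    """Disjointness test: node has no common p-successor and q-successor."""
    return successors(graph, node, p).isdisjoint(successors(graph, node, q))


def satisfies_closed(graph: Set[Triple], node: str, allowed: Set[str]) -> bool:
    """Closure constraint: every outgoing property of node is in `allowed`."""
    return all(p in allowed for (s, p, _o) in graph if s == node)


# Hypothetical example data, not taken from the paper.
example = {
    ("alice", "author", "paper1"),
    ("alice", "reviewer", "paper1"),
    ("bob", "author", "paper2"),
    ("bob", "reviewer", "paper3"),
}
```

In this example, `alice` satisfies the equality test on `author` and `reviewer` but not the disjointness test, while `bob` satisfies disjointness but not equality. Neither node is closed with respect to `{"author"}` alone, since both also have an outgoing `reviewer` edge.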

Paper Structure

This paper contains 18 sections, 20 theorems, 12 equations, 4 figures, and 4 tables.

Introduction
Shape schemas
Shapes
Graphs and their interpretation
Targets and shape schemas
Expressiveness of SHACL features
Preliminaries on path expressions
Disjointness
Equality
Closure
Are target-based shape schemas enough?
Extensions for full equality and disjointness tests
Full equality
Full disjointness
Further non-definability results
...and 3 more sections

Key Result

Theorem 3.1

Let $X \in\{\mathit{eq},\mathit{disj},\mathit{closed}\}$ and let $F$ be a feature set with $X \notin F$. Then $Q_X$ is not definable in $\mathcal{L}(F)$.
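One way to read this result, combined with the abstract's statement that each feature can express queries that are not expressible without it, is as a strict increase in expressive power. The notation $\mathcal{D}(F)$ for the class of boolean queries definable in $\mathcal{L}(F)$ is introduced here only for illustration and is not taken from the paper; the sketch also assumes that $Q_X$ is definable once $X$ is available.

```latex
% Assumed notation: \mathcal{D}(F) = boolean queries definable in \mathcal{L}(F).
% For X \in \{\mathit{eq}, \mathit{disj}, \mathit{closed}\} and X \notin F:
\[
  Q_X \in \mathcal{D}(F \cup \{X\}) \setminus \mathcal{D}(F)
  \quad\text{and hence}\quad
  \mathcal{D}(F) \subsetneq \mathcal{D}(F \cup \{X\}).
\]
```

The inclusion itself is immediate, since every shape schema in $\mathcal{L}(F)$ is also one in $\mathcal{L}(F \cup \{X\})$; the theorem supplies the strictness.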

Figures (4)

Figure 1: An example graph $G_\mathit{ex}$
Figure 2: Graphs used to prove Proposition \ref{deprop}. The nodes are taken outside $\Sigma$. For $X=\mathit{eq}$, the cloud shown for $G'$ represents a complete directed graph on $m+1$ nodes, with self-loops, and $G$ is the same graph with one directed edge removed. For $X=\mathit{disj}$, in the picture for $G$, each cloud again stands for a complete graph, but this time on $M=\max(m,3)$ nodes, and without the self-loops. Each oval stands for a set of $M$ separate nodes. An arrow from one blob to the next means that every node of the first blob has a directed edge to every node of the next blob. So, $G$ is a directed 4-cycle of alternating clouds and ovals, and $G'$ is a directed 4-cycle of clouds.
Figure 3: Illustration of the $p$ and $q$ relations in graphs $G=G_\mathit{full\text{-}disj}(\Sigma,m)$ and $G'=G'_\mathit{full\text{-}disj}(\Sigma,m)$
Figure 4: Illustration of the $p^-$ and $q^-$ relations in graphs $G=G_\mathit{full\text{-}disj}(\Sigma,m)$ and $G'=G'_\mathit{full\text{-}disj}(\Sigma,m)$
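
The pair of graphs described in the Figure 2 caption for $X=\mathit{eq}$ can be sketched in a few lines of Python. This is only an illustration of the caption, under stated assumptions: nodes are encoded as integers, edge labels are omitted because the caption does not name them, and the choice of which edge to remove (the `removed` parameter, defaulting to the edge from node 0 to node 1) is arbitrary and hypothetical.

```python
from typing import Set, Tuple

# A directed edge between integer-named nodes (assumed encoding, no labels).
Edge = Tuple[int, int]


def complete_digraph_with_loops(n: int) -> Set[Edge]:
    """All n*n directed edges on nodes 0..n-1, including self-loops."""
    return {(u, v) for u in range(n) for v in range(n)}


def eq_graph_pair(m: int, removed: Edge = (0, 1)) -> Tuple[Set[Edge], Set[Edge]]:
    """Return (G, G') as in the caption's X = eq case.

    G' is the complete directed graph with self-loops on m + 1 nodes;
    G is G' with the single edge `removed` taken out.
    """
    g_prime = complete_digraph_with_loops(m + 1)
    if removed not in g_prime:
        raise ValueError("the removed edge must be an edge of G'")
    g = g_prime - {removed}
    return g, g_prime
```

For example, `eq_graph_pair(3)` yields a graph $G'$ with $4 \cdot 4 = 16$ edges and a graph $G$ with 15 edges that differs from $G'$ in exactly one edge.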

Theorems & Definitions (59)

Remark 2.1
Remark 2.2
Remark 2.3
Example 2.4
Remark 2.5
Example 2.6: Example \ref{ex:running1} continued
Example 2.7: Example \ref{ex:running2} continued
Theorem 3.1
Proposition 3.2
Lemma 3.3
...and 49 more
