Consistent Query Answering over SHACL Constraints
Shqiponja Ahmetaj, Timo Camillo Merkl, Reinhard Pichler
TL;DR
This work investigates consistent query answering (CQA) over SHACL-constrained RDF graphs, focusing on minimal repairs and a fundamental SPARQL fragment: basic graph patterns (BGPs) and well-designed queries. It provides a detailed complexity landscape across four query languages and three inconsistency-tolerant semantics (brave, AR, IAR), in terms of both data complexity and combined complexity. The results establish broad intractability: the CQA variants range from the first to the third level of the polynomial hierarchy, with matching upper and lower bounds and complete classifications for recursive and non-recursive SHACL. The work also extends the framework to max-repairs, where BH- and Θ₂P-level complexities arise, and outlines directions for practical algorithms and for fragments of SHACL that admit tractable CQA.
Abstract
The Shapes Constraint Language (SHACL) was standardized by the World Wide Web Consortium (W3C) as a constraint language to describe and validate RDF data graphs. SHACL uses the notion of a shapes graph to describe a set of shape constraints paired with targets that specify which nodes of the RDF graph should satisfy which shapes. An important question in practice is how to handle data graphs that do not validate against the shapes graph. One solution is to tolerate the non-validation and find ways to obtain meaningful and correct answers to queries despite it. This approach is known as consistent query answering (CQA), and there is extensive literature on CQA in both the database and the KR setting. We study CQA in the context of SHACL for a fundamental fragment of the Semantic Web query language SPARQL. The goal of our work is a detailed complexity analysis of CQA for various semantics and possible restrictions on the acceptable repairs. It turns out that all considered variants of the problem are intractable, with complexities ranging between the first and third level of the polynomial hierarchy.
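To make the three inconsistency-tolerant semantics (brave, AR, IAR) concrete, the following is a minimal Python sketch on a toy graph. It is only an illustration under simplifying assumptions, not the formal setting of the paper: repairs are taken to be maximal consistent subsets of the data (deletions only), the single constraint "at most one worksFor value per node" stands in for a SHACL shape, and all names (`subset_repairs`, `at_most_one_employer`, `who_works`, the triples) are hypothetical. The brute-force enumeration is exponential and only meant for tiny inputs.

```python
from itertools import combinations

def subset_repairs(facts, is_consistent):
    """All maximal consistent subsets of `facts` (deletion-only repairs; an assumption)."""
    facts = list(facts)
    consistent = [frozenset(c)
                  for r in range(len(facts), -1, -1)
                  for c in combinations(facts, r)
                  if is_consistent(frozenset(c))]
    # Keep only subsets that are not strictly contained in another consistent subset.
    return [s for s in consistent if not any(s < t for t in consistent)]

def brave(repairs, query):
    """Answers that hold in at least one repair."""
    return set().union(*(query(r) for r in repairs))

def ar(repairs, query):
    """Answers that hold in every repair."""
    return set.intersection(*(set(query(r)) for r in repairs))

def iar(repairs, query):
    """Answers that hold in the intersection of all repairs."""
    return set(query(frozenset.intersection(*repairs)))

# Hypothetical stand-in for a shape: each node has at most one worksFor value.
def at_most_one_employer(graph):
    subjects = [s for (s, p, o) in graph if p == "worksFor"]
    return len(subjects) == len(set(subjects))

# Toy query with projection: which nodes work for someone?
def who_works(graph):
    return {s for (s, p, o) in graph if p == "worksFor"}

graph = {
    ("ann", "worksFor", "acme"),
    ("ann", "worksFor", "initech"),  # conflicts with the triple above
    ("bob", "worksFor", "acme"),
}

repairs = subset_repairs(graph, at_most_one_employer)
brave_answers = brave(repairs, who_works)
ar_answers = ar(repairs, who_works)
iar_answers = iar(repairs, who_works)
```

In this toy instance there are two repairs, each keeping one of the two conflicting triples about `ann`. The node `ann` is an answer in every repair, so it is an AR answer (and a brave answer), but neither of its triples survives in the intersection of the repairs, so it is not an IAR answer; `bob` is an answer under all three semantics.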
