Computational Complexity of Preferred Subset Repairs on Data-Graphs

Nina Pardal; Santiago Cifuentes; Edwin Pin; Maria Vanina Martinez; Sergio Abriola

TL;DR

The paper investigates prioritized subset repairs for data-graphs under GXPath-based integrity constraints, formalizing how different preorder criteria (inclusion, cardinality, weights, and multisets) shape the space of repairs and the corresponding decision and query problems. It proves that the prioritized repair notions extend classical subset repairs and that, in the node-positive fragment of GXPath, a unique repair exists and can be computed in PTIME, which yields PTIME solvability for the repair and CQA tasks in this regime. For the full GXPath language, the authors establish a complexity map that places repair existence in NP and repair checking in coNP, with CQA reaching $\Pi^p_2$-hardness or hardness at the level of $\Delta^p_2[\log n]$, depending on the exact language and preorder. Overall, the work delineates when efficient repair and cautious query answering are feasible, and it points future work toward refined tractable variants and alternative consistency notions for graph-structured data systems.

Abstract

Preferences are a pivotal component in practical reasoning, especially in tasks that involve decision-making over different options or courses of action that could be pursued. In this work, we focus on repairing and querying inconsistent knowledge bases in the form of graph databases, which involves finding a way to solve conflicts in the knowledge base and considering answers that are entailed from every possible repair, respectively. Without a priori domain knowledge, all possible repairs are equally preferred. Though that may be adequate for some settings, it seems reasonable to establish and exploit some form of preference order among the potential repairs. We study the problem of computing prioritized repairs over graph databases with data values, using a notion of consistency based on GXPath expressions as integrity constraints. We present several preference criteria based on the standard subset repair semantics, incorporating weights, multisets, and set-based priority levels. We show that it is possible to maintain the same computational complexity as in the case where no preference criterion is available for exploitation. Finally, we explore the complexity of consistent query answering in this setting and obtain tight lower and upper bounds for all the preference criteria introduced.

Paper Structure (13 sections, 16 theorems, 18 equations, 2 figures, 2 tables)

Introduction
Preliminaries
Types of preferences
Computational problems related to preferred repairs
Complexity classes
Complexity of preferred repairs
Upper bounds / Membership
Lower bounds
Complexity of CQA
Upper bounds / Membership
Lower bounds
Conclusions and Future Work
Appendix

Key Result

Lemma 2

Given a data-graph $G$ and an expression $\eta \in \text{GXPath}$, there is an algorithm that computes the set $\llbracket \eta \rrbracket_G$ in time polynomial in the size of $G$ and $\eta$. As a consequence, we can decide whether $G \models R$ in time polynomial in the size of $G$ and $R$.
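To illustrate why evaluation of this kind stays polynomial, here is a minimal Python sketch for a small fragment of path expressions (labels, inverse, union, concatenation, Kleene star, and a data-equality test). It is not the algorithm from the cited result: the tuple-based expression encoding, the names `evaluate`, `compose`, and `satisfies`, and the toy graph are all hypothetical. The `satisfies` helper additionally assumes that a path constraint holds when every pair of nodes belongs to its denotation, which is only one reading of the consistency definition.

```python
def compose(left, right):
    """Relational composition of two sets of node pairs."""
    by_source = {}
    for (v, w) in right:
        by_source.setdefault(v, set()).add(w)
    return {(u, w) for (u, v) in left for w in by_source.get(v, ())}


def evaluate(expr, nodes, edges):
    """Return the set of node pairs denoted by a path expression.

    nodes: dict mapping node id -> data value
    edges: set of (source, label, target) triples
    expr:  nested tuple in a hypothetical encoding, e.g. ("star", ("label", "r"))
    """
    kind = expr[0]
    if kind == "eps":
        return {(v, v) for v in nodes}
    if kind == "label":
        return {(u, v) for (u, lab, v) in edges if lab == expr[1]}
    if kind == "inverse":
        return {(v, u) for (u, v) in evaluate(expr[1], nodes, edges)}
    if kind == "union":
        return evaluate(expr[1], nodes, edges) | evaluate(expr[2], nodes, edges)
    if kind == "concat":
        return compose(evaluate(expr[1], nodes, edges),
                       evaluate(expr[2], nodes, edges))
    if kind == "data_eq":
        # Keep only pairs whose endpoints carry the same data value.
        inner = evaluate(expr[1], nodes, edges)
        return {(u, v) for (u, v) in inner if nodes[u] == nodes[v]}
    if kind == "star":
        # Fixpoint iteration: the result can hold at most |V|^2 pairs,
        # so the loop stops after polynomially many rounds.
        base = evaluate(expr[1], nodes, edges)
        result = {(v, v) for v in nodes}
        while True:
            extended = result | compose(result, base)
            if extended == result:
                return result
            result = extended
    raise ValueError(f"unknown expression kind: {kind}")


def satisfies(constraints, nodes, edges):
    """Assumed semantics: each path constraint must relate every pair of nodes."""
    all_pairs = {(u, v) for u in nodes for v in nodes}
    return all(evaluate(c, nodes, edges) == all_pairs for c in constraints)


# Toy data-graph (hypothetical): three nodes with data values, labelled edges.
nodes = {"a": 1, "b": 2, "c": 1}
edges = {("a", "r", "b"), ("b", "r", "c"), ("c", "s", "a")}

r_star = evaluate(("star", ("label", "r")), nodes, edges)
same_value = evaluate(("data_eq", ("star", ("label", "r"))), nodes, edges)
```

Each operator is computed bottom-up as a binary relation over the nodes, so every intermediate result has at most quadratically many pairs; this is the intuition behind the polynomial bound, not a proof of it.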

Figures (2)

Figure 1: A film data-graph.
Figure 2: (a) A data-graph that does not satisfy the constraints from Example \ref{example:WeightsAndRepairs} (for example, $(\textsc{B}, \textsc{D}) \notin \llbracket \alpha_{hll\rightarrow rn} \rrbracket$). (b) A subset data-graph of the data-graph in (a) that satisfies the constraints and is a subset repair, but is not a $w$-preferred subset repair; the associated weight of this repair is $w(G) - 3$ (from one deleted high edge). (c) A $w$-preferred subset repair; the associated weight of this repair is $w(G) - 2$ (from two deleted low edges).
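
The weight comparison in the Figure 2 caption can be written out as a one-line calculation. The names $G_b$ and $G_c$ for the data-graphs in panels (b) and (c) are introduced here for illustration only, and the reading that a $w$-preferred repair is one retaining the greatest total weight is an assumption consistent with the values stated in the caption.

```latex
\[
  w(G_b) \;=\; w(G) - 3 \;<\; w(G) - 2 \;=\; w(G_c)
\]
```

Under that reading, deleting the single high edge costs more weight than deleting the two low edges, which is why the repair in (b) is not $w$-preferred while the one in (c) is.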

Theorems & Definitions (33)

Definition 1: Consistency
Lemma 2: libkin2016querying, Theorem 4.3
Example 1
Definition 3: Prioritized subset repairs
Definition 4: Multisets
Definition 5
Example 2
Example 3
Example 4: Cont. Example \ref{example:WeightsAndRepairs}
Lemma 6
...and 23 more
