Complete Approximations of Incomplete Queries

Julien Corman; Werner Nutt; Ognjen Savković

TL;DR

The paper investigates the completeness of conjunctive queries over partially complete databases described by table-completeness (TC) statements. It introduces two complementary approximation strategies: Minimal Complete Generalizations (MCGs), computed as the least fixed-point of a monotone operator $G_{\mathbf{C}}$, and Maximal Complete Specializations (MCSs), computed with the help of complete unifiers and maximal complete instantiations (MCIs). The work establishes complexity results (e.g., checking $Q' = G_{\mathbf{C}}(Q)$ is DP-complete, and MCGs can be computed by a $P^{NP}$-style procedure) and develops algorithms to obtain complete approximations, including finiteness guarantees via acyclic dependency graphs and size-bounded $k$-MCSs. It also discusses implementations using ASP/Datalog and Prolog, and plans to extend the theory to integrity constraints such as keys and finite domains.

Abstract

This paper studies the completeness of conjunctive queries over a partially complete database and the approximation of incomplete queries. Given a query and a set of completeness rules (a special kind of tuple generating dependencies) that specify which parts of the database are complete, we investigate whether the query can be fully answered, as if all data were available. If not, we explore reformulating the query into either Maximal Complete Specializations (MCSs) or the (unique up to equivalence) Minimal Complete Generalization (MCG) that can be fully answered, that is, the best complete approximations of the query from below or above in the sense of query containment. We show that the MCG can be characterized as the least fixed-point of a monotonic operator in a preorder. Then, we show that an MCS can be computed by recursive backward application of completeness rules. We study the complexity of both problems and discuss implementation techniques that rely on ASP and Prolog engines, respectively.
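The fixed-point idea behind generalization can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the paper's algorithm: a TC statement is modelled as a pair (head atom, list of condition atoms), statement variables are strings starting with `?`, query variables are frozen into plain constants, and a query body is treated as complete when one application of the statements leaves it unchanged. The names `apply_tc` and `generalize_body` are hypothetical, and the sketch ignores the query head and query minimization, which a full treatment would have to handle.

```python
def is_var(term):
    # Assumption: statement variables are written as strings starting with "?".
    return isinstance(term, str) and term.startswith("?")

def match(pattern, fact, subst):
    """Extend subst so that pattern maps onto the ground fact, else return None."""
    (p_rel, p_args), (f_rel, f_args) = pattern, fact
    if p_rel != f_rel or len(p_args) != len(f_args):
        return None
    subst = dict(subst)
    for p, f in zip(p_args, f_args):
        if is_var(p):
            if subst.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return subst

def matches(patterns, facts, subst):
    """Yield every substitution that maps all patterns into facts."""
    if not patterns:
        yield subst
        return
    for fact in facts:
        extended = match(patterns[0], fact, subst)
        if extended is not None:
            yield from matches(patterns[1:], facts, extended)

def apply_tc(statements, facts):
    """Keep the facts whose completeness is guaranteed by some statement."""
    kept = set()
    for head, condition in statements:
        for fact in facts:
            subst = match(head, fact, {})
            if subst is not None and next(matches(condition, facts, subst), None) is not None:
                kept.add(fact)
    return kept

def generalize_body(statements, frozen_body):
    """Drop atoms until the body is unchanged by apply_tc (a fixed point)."""
    current = set(frozen_body)
    while True:
        reduced = apply_tc(statements, current)
        if reduced == current:
            return current
        current = reduced

# Hypothetical example: pupils are completely recorded, but language
# enrolment is complete only for French.
statements = [
    (("pupil", ("?N", "?C", "?S")), []),
    (("learns", ("?N", "french")), [("pupil", ("?N", "?C", "?S"))]),
]
body = {("pupil", ("n", "c", "s")), ("learns", ("n", "english"))}
result = generalize_body(statements, body)
```

In this example the atom `learns(n, english)` is not covered by any statement, so the loop drops it and stabilizes on the body containing only `pupil(n, c, s)`. Because each pass can only remove atoms, the loop terminates after at most as many passes as the body has atoms.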

Paper Structure (16 sections, 22 theorems, 21 equations, 1 table, 4 algorithms)

Introduction
Query Completeness
Query Generalization
Generalization Algorithm
Query Specialization
Maximal Complete Instantiations
Adding Atoms
Implementation
Implementing Generalization
Implementing Specialization
Conclusion
The Complexity of Identifying MCGs
The Number of MCSs for Acyclic Sets of TC Statements
Complete Unifiers Make Queries Complete
Maximal Instantiations are Produced by Complete Unifiers
...and 1 more section

Key Result

Proposition 1.2

Let $\mathbf{C}$ be a set of TC statements. Then

Theorems & Definitions (45)

Example 1.1
Proposition 1.2: $T_\mathbf{C}$ Operator (CIKM2015-Nutt)
Theorem 1.3: Characterization of Completeness (CIKM2015-Nutt)
Example 1.4
Example 1.5: Complete Generalizations and Specializations
Proposition 1.6: Characterization of Containment
Definition 1.7: Minimal Complete Generalization
Proposition 1.8: MCGs are Subqueries
Lemma 1.9: Completeness of Minimal Queries
proof
...and 35 more
