Complete Approximations of Incomplete Queries
Julien Corman, Werner Nutt, Ognjen Savković
TL;DR
The paper investigates the completeness of conjunctive queries over partially complete databases using table-completeness statements (TCS). It introduces two complementary approximation strategies: Minimal Complete Generalizations (MCG) computed via a least fixed-point of a monotone operator $G_{\boldsymbol{C}}$, and Maximal Complete Specializations (MCS) guided by complete unifiers and MCIs. The work establishes complexity results (e.g., DP-complete for checking $Q' = G_{\boldsymbol{C}}(Q)$ and a $P^{NP}$-style approach for computing MCGs) and develops algorithms to obtain complete approximations, including finite bounding through acyclic dependency graphs and $k$-MCSs. It also discusses practical implementations using ASP/Datalog and Prolog, with plans to extend the theory to integrity constraints such as keys and finite domains, enhancing applicability to real-world incomplete data problems.
Abstract
This paper studies the completeness of conjunctive queries over a partially complete database and the approximation of incomplete queries. Given a query and a set of completeness rules (a special kind of tuple generating dependencies) that specify which parts of the database are complete, we investigate whether the query can be fully answered, as if all data were available. If not, we explore reformulating the query into either Maximal Complete Specializations (MCSs) or the (unique up to equivalence) Minimal Complete Generalization (MCG) that can be fully answered, that is, the best complete approximations of the query from below or above in the sense of query containment. We show that the MCG can be characterized as the least fixed-point of a monotonic operator in a preorder. Then, we show that an MCS can be computed by recursive backward application of completeness rules. We study the complexity of both problems and discuss implementation techniques that rely on an ASP engine and a Prolog engine, respectively.
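The fixed-point idea behind the MCG can be illustrated with a small Python sketch. It is a toy reading of the approach, not the paper's algorithm or its ASP encoding: it assumes relational conjunctive queries without comparisons, set semantics, and an operator that keeps only those body atoms that some completeness rule guarantees to be complete when the body is viewed as a frozen database. The data encoding, the function names (`t_c`, `mcg_body`), and the `pupil`/`school` example are all hypothetical.

```python
# Atoms are tuples: (predicate, term1, term2, ...).
# In completeness rules, terms starting with an uppercase letter are variables.
# Query body atoms are "frozen": all their terms are treated as constants.

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def match(pattern, fact, subst):
    """Extend subst so that pattern maps onto fact, or return None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    s = dict(subst)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if s.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return s

def homomorphisms(patterns, facts, subst=None):
    """Yield every substitution mapping all patterns into facts."""
    subst = {} if subst is None else subst
    if not patterns:
        yield subst
        return
    for fact in facts:
        s = match(patterns[0], fact, subst)
        if s is not None:
            yield from homomorphisms(patterns[1:], facts, s)

def t_c(rules, facts):
    """Facts guaranteed complete: rule heads whose condition holds in facts."""
    out = set()
    for head, cond in rules:
        for s in homomorphisms([head] + list(cond), facts):
            out.add((head[0],) + tuple(s[t] if is_var(t) else t for t in head[1:]))
    return out

def mcg_body(rules, head_vars, body):
    """Iterate t_c on the frozen body until nothing changes.

    Returns the body of the generalized query, or None if a head variable
    is lost along the way (the remaining query would be unsafe).
    """
    current = frozenset(body)
    while True:
        nxt = frozenset(t_c(rules, current))
        if nxt == current:
            break
        current = nxt
    covered = {t for atom in current for t in atom[1:]}
    return current if set(head_vars) <= covered else None

# Query: Q(N) :- pupil(N, C, S), school(S, primary)
body = [("pupil", "N", "C", "S"), ("school", "S", "primary")]

# Rule set A: every pupil tuple is complete, nothing is said about school.
rules_a = [(("pupil", "P", "K", "Sc"), [])]

# Rule set B: pupil tuples are complete only for primary schools.
rules_b = [(("pupil", "P", "K", "Sc"), [("school", "Sc", "primary")])]

result_a = mcg_body(rules_a, ["N"], body)
result_b = mcg_body(rules_b, ["N"], body)
```

Under rule set A the iteration drops the `school` atom and stabilizes on `Q(N) :- pupil(N, C, S)`. Under rule set B the `school` atom is dropped first, which then invalidates the condition that protected the `pupil` atom, so the body becomes empty and the sketch reports that no safe generalization exists. Because each step can only remove atoms, the loop terminates after at most as many rounds as there are body atoms.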
