Query Repairs
Balder ten Cate, Phokion Kolaitis, Carsten Lutz
TL;DR
The paper addresses the problem of repairing database queries using user-labeled examples. It introduces a general framework based on a proximity pre-order $\preceq$, in which a repair is a query that fits the labeled examples and is as close as possible to the original query with respect to $\preceq$. The framework is specialized to conjunctive queries (CQs) with two pre-orders: $\preceq^{\text{cod}}$, based on containment of differences, and $\preceq^{\text{edit-dist}}$, based on an edit distance over cores. Repairs, generalizations, and specializations are analyzed under each. Key findings include: (i) $\preceq^{\text{cod}}$-generalizations are typically unique when they exist, while specializations may not be, and the set of repairs can be infinite; (ii) $\preceq^{\text{edit-dist}}$ yields a non-empty, finite set of repairs, with concrete complexity results for repairs and the related notions. The work relates query repairs to extremal fitting and studies the algorithmic problems of verification, existence, and construction within this framework, with connections to the broader repair and learning literature. The approach enables principled, minimal modifications to CQs guided by labeled feedback, with potential impact on interactive data cleaning and query refinement.
Abstract
We formalize and study the problem of repairing database queries based on user feedback in the form of a collection of labeled examples. We propose a framework based on the notion of a proximity pre-order, and we investigate and compare query repairs for conjunctive queries (CQs) using different such pre-orders. The proximity pre-orders we consider are based on query containment and on distance metrics for CQs.
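To make the notion of a query "fitting" a collection of labeled examples concrete, the following is a minimal sketch, not taken from the paper. It assumes the usual reading: a CQ fits when it returns every positively labeled tuple and no negatively labeled one. The representation (atoms as pairs of a relation name and a variable tuple, instances as dictionaries of fact sets) and the names `evaluate` and `fits` are illustrative assumptions, and the brute-force evaluation is meant only for tiny inputs.

```python
from itertools import product

def evaluate(atoms, answer_vars, instance):
    """Brute-force CQ evaluation (illustrative only).

    atoms:       list of (relation_name, tuple_of_variables)
    answer_vars: tuple of variables, each occurring in some atom
    instance:    dict mapping relation_name -> set of constant tuples
    """
    variables = sorted({v for _, args in atoms for v in args})
    domain = sorted({c for facts in instance.values() for t in facts for c in t})
    answers = set()
    # Try every assignment of domain constants to the query variables.
    for values in product(domain, repeat=len(variables)):
        h = dict(zip(variables, values))
        # Keep the assignment if it maps every atom to a fact of the instance.
        if all(tuple(h[v] for v in args) in instance.get(rel, set())
               for rel, args in atoms):
            answers.add(tuple(h[v] for v in answer_vars))
    return answers

def fits(atoms, answer_vars, examples):
    """examples: list of (instance, tuple, label); label is True for positive."""
    return all((tup in evaluate(atoms, answer_vars, inst)) == label
               for inst, tup, label in examples)

# Hypothetical data: a path a -> b -> c in a binary relation R.
instance = {"R": {("a", "b"), ("b", "c")}}
examples = [(instance, ("a",), True), (instance, ("b",), False)]

# q1(x) :- R(x, y), R(y, z)   returns only "a", so it fits both examples.
q1 = [("R", ("x", "y")), ("R", ("y", "z"))]
# q2(x) :- R(x, y)            also returns "b", so it violates the negative example.
q2 = [("R", ("x", "y"))]

print(fits(q1, ("x",), examples))
print(fits(q2, ("x",), examples))
```

In this toy setting, `q2` does not fit the examples while `q1` does; a repair of `q2` would be a fitting query that is as close as possible to `q2` under the chosen proximity pre-order, which this sketch does not attempt to compute.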
