Goal-Driven Query Answering over First- and Second-Order Dependencies with Equality

Efthymia Tsamoura, Boris Motik

TL;DR

This work tackles the challenge of efficiently answering queries over data with rich dependencies, including first- and second-order dependencies with equality. It introduces a goal-driven pipeline that transforms dependencies to suppress irrelevant inferences, combining a refined singularisation, a relevance analysis, and a tailored magic-sets approach. The authors extend singularisation to second-order dependencies with functional reflexivity, adapt relevance analysis and magic sets to handle equality, and provide a practical workflow that converts second-order problems into solvable logic programs while preserving answers. Extensive experiments on first- and second-order benchmarks show substantial speedups over full chase computation, highlighting the practical impact for scalable query answering in complex dependency settings.

Abstract

Query answering over data with dependencies plays a central role in most applications of dependencies. The problem is commonly solved by using a suitable variant of the chase algorithm to compute a universal model of the dependencies and the data and thus explicate all knowledge implicit in the dependencies. After this preprocessing step, an arbitrary conjunctive query over the dependencies and the data can be answered by evaluating it over the computed universal model. If, however, the query to be answered is fixed and known in advance, computing the universal model is often inefficient, as many inferences made during this process can be irrelevant to the given query. In such cases, a goal-driven approach, which avoids drawing unnecessary inferences, promises to be more efficient and thus preferable in practice. In this paper we present what we believe to be the first technique for goal-driven query answering over first- and second-order dependencies with equality reasoning. Our technique transforms the input dependencies so that applying the chase to the output avoids many inferences that are irrelevant to the query. The transformation proceeds in several steps, which comprise the following three novel techniques. First, we present a variant of the singularisation technique by Marnette [60] that is applicable to second-order dependencies and that corrects an incompleteness of a related formulation by ten Cate et al. [74]. Second, we present a relevance analysis technique that can eliminate from the input those dependencies that provably do not contribute to query answers. Third, we present a variant of the magic sets algorithm [19] that can handle second-order dependencies with equality reasoning. We also present the results of an extensive empirical evaluation, which show that goal-driven query answering can be orders of magnitude faster than computing the full universal model.
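To make the contrast between full materialisation and goal-driven evaluation concrete, the following is a minimal toy sketch in Python. It is not the paper's algorithm: it handles only plain Datalog rules (no existential quantifiers, no function symbols, no equality), and its pruning step is a coarse predicate-level reachability check rather than the relevance analysis or magic sets transformation described above. The names `materialise`, `relevant_rules`, and the example predicates are hypothetical and chosen purely for illustration.

```python
# Toy illustration only: plain Datalog, no existentials, no equality.
# An atom is (predicate, args); a rule is (head_atom, [body_atoms]).
# Terms starting with an uppercase letter are variables (an assumption
# of this sketch); everything else is a constant.

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def match(atom, fact, subst):
    """Extend subst so that atom matches fact, or return None."""
    pred, args = atom
    fpred, fargs = fact
    if pred != fpred or len(args) != len(fargs):
        return None
    s = dict(subst)
    for a, v in zip(args, fargs):
        if is_var(a):
            if s.setdefault(a, v) != v:
                return None
        elif a != v:
            return None
    return s

def materialise(rules, facts):
    """Naive fixpoint: apply all rules until no new fact is derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            substs = [{}]
            for atom in body:
                substs = [s2 for s in substs for f in facts
                          if (s2 := match(atom, f, s)) is not None]
            for s in substs:
                new = (head[0], tuple(s.get(a, a) for a in head[1]))
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

def relevant_rules(rules, query_pred):
    """Keep rules whose head predicate is backward-reachable from the query."""
    needed, frontier = {query_pred}, [query_pred]
    while frontier:
        p = frontier.pop()
        for head, body in rules:
            if head[0] == p:
                for body_pred, _ in body:
                    if body_pred not in needed:
                        needed.add(body_pred)
                        frontier.append(body_pred)
    return [r for r in rules if r[0][0] in needed]

rules = [
    (("path", ("X", "Y")), [("edge", ("X", "Y"))]),
    (("path", ("X", "Z")), [("edge", ("X", "Y")), ("path", ("Y", "Z"))]),
    # Irrelevant to a query over `path`:
    (("colleague", ("X", "Y")), [("worksAt", ("X", "C")), ("worksAt", ("Y", "C"))]),
]
base = {
    ("edge", ("a", "b")), ("edge", ("b", "c")),
    ("worksAt", ("p", "o")), ("worksAt", ("q", "o")),
}

full = materialise(rules, base)
goal = materialise(relevant_rules(rules, "path"), base)
answers_full = {f for f in full if f[0] == "path"}
answers_goal = {f for f in goal if f[0] == "path"}
```

In this toy setting both runs produce the same `path` facts, while the pruned run never derives any `colleague` fact. The techniques in the paper are considerably finer-grained, since they also take the constants in the query and equality reasoning into account.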

Paper Structure

This paper contains 46 sections, 14 theorems, 42 equations, 7 figures, 3 tables, 7 algorithms.

Key Result

Proposition 4

For each generalised second-order dependency $\Sigma$, each base instance $B$, and each fact of the form $\mathsf{Q}(\mathbf{a})$, it holds that ${\{ \Sigma \} \cup B \models_\approx \mathsf{Q}(\mathbf{a})}$ if and only if ${\mathsf{fol}(\Sigma) \cup B \models_\approx \mathsf{Q}(\mathbf{a})}$.

Figures (7)

  • Figure 1: A Universal Model for the Second-Order Dependency $\Sigma^\mathit{ex}$ from the running example
  • Figure 2: Distribution of the times and the numbers of derived facts and rules for Mat, Rel, Mag, and Rel+Mag on first-order scenarios
  • Figure 3: Distribution of the times and the numbers of derived facts for Mat, Rel, Mag, and Rel+Mag on second-order scenarios
  • Figure 4: Distribution of the numbers of rules for Mat, Mag, Rel, and Rel+Mag on second-order scenarios
  • Figure : $\mathsf{benchmarkGenerator}$
  • ...and 2 more figures

Theorems & Definitions (46)

  • Example 1 (Fagin, Kolaitis, Popa, and Tan, TODS 2005)
  • Example 2
  • Definition 3
  • Proposition 4
  • proof
  • Example 5
  • Definition 6
  • Theorem 7
  • Example 8
  • Definition 9
  • ...and 36 more