Ranked Enumeration of Conjunctive Query Results

Shaleen Deep, Paraschos Koutris

TL;DR

This work tackles ranked enumeration of full Conjunctive Queries (CQs) over relational databases, avoiding the cost of materializing and sorting all results. It introduces two properties of ranking functions, decomposable and compatible, defined with respect to a query decomposition, and proves a main theorem that yields preprocessing time $O(|D|^{fhw})$ and enumeration delay $O(\log|D|)$ when the ranking function is compatible with the decomposition, where $fhw$ is the fractional hypertree width of the query. The framework supports several ranking families (vertex-based, tuple-based, lexicographic, and bounded) and extends to UCQs and to improvements based on submodular width, while establishing lower bounds showing that some structure in the ranking function is necessary. The results advance ranked-enumeration theory and have practical implications for scalable top-$k$ CQ evaluation, with potential extensions to dynamic settings and broader query families.

Abstract

We study the problem of enumerating answers of Conjunctive Queries ranked according to a given ranking function. Our main contribution is a novel algorithm with small preprocessing time, logarithmic delay, and non-trivial space usage during execution. To allow for efficient enumeration, we exploit certain properties of ranking functions that frequently occur in practice. To this end, we introduce the notions of decomposable and compatible (w.r.t. a query decomposition) ranking functions, which allow for partial aggregation of tuple scores in order to efficiently enumerate the output. We complement the algorithmic results with lower bounds that justify why restrictions on the structure of ranking functions are necessary. Our results extend and improve upon a long line of work that has studied ranked enumeration from both a theoretical and practical perspective.
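
To make the problem statement concrete, the following is a minimal sketch of ranked enumeration for a single two-relation join, $R(x, y) \bowtie S(y, z)$, ranked by the sum of tuple weights. It is not the paper's algorithm, only a simplified illustration of the general idea of partially aggregating scores and keeping a priority queue of candidates. The relation layout, the function name `ranked_join`, and the weights are all assumptions made for this example.

```python
import heapq
from collections import defaultdict
from typing import Iterator, List, Tuple

# Hypothetical weighted relations: R holds (x, y, weight), S holds (y, z, weight).
Weighted = List[Tuple[str, str, float]]


def ranked_join(r: Weighted, s: Weighted) -> Iterator[Tuple[float, Tuple[str, str, str]]]:
    """Yield results of R(x, y) JOIN S(y, z) in non-decreasing order of summed weight."""
    # Preprocessing: group S by the join value y and sort each group by weight.
    s_by_y = defaultdict(list)
    for y, z, w in s:
        s_by_y[y].append((w, z))
    for group in s_by_y.values():
        group.sort()

    # Seed the heap with the cheapest partner of every R tuple that joins.
    heap = []
    for idx, (_, y, w) in enumerate(r):
        if y in s_by_y:
            heap.append((w + s_by_y[y][0][0], idx, 0))
    heapq.heapify(heap)

    # Enumeration: pop the cheapest candidate, then push its successor.
    while heap:
        total, idx, pos = heapq.heappop(heap)
        x, y, w = r[idx]
        group = s_by_y[y]
        yield total, (x, y, group[pos][1])
        if pos + 1 < len(group):
            heapq.heappush(heap, (w + group[pos + 1][0], idx, pos + 1))
```

In this sketch each output costs one heap pop and at most one heap push on a heap of at most $|R|$ entries, so the delay between consecutive results is logarithmic in the input size. Because it sorts the groups of $S$ up front, its preprocessing is $O(|D| \log |D|)$, and the answers are never materialized or sorted as a whole.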

Paper Structure

This paper contains 34 sections, 20 theorems, 17 equations, 3 figures, 5 algorithms.

Key Result

Proposition 2.6

Let $\textnormal{rank}$ be a ranking function over a set of variables $\mathcal{V}$, and $T \subseteq \mathcal{V}$. If $\textnormal{rank}$ is $T$-decomposable, then it is also $T$-decomposable conditioned on $S$ for any $S \subseteq \mathcal{V} \setminus T$.
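
The proposition can be checked by brute force on small instances. The sketch below rests on two assumptions that this summary does not spell out. First, $\textnormal{rank}$ is taken to be $T$-decomposable when the valuations over $T$ admit a total order that $\textnormal{rank}$ preserves under every extension to the remaining variables. Second, conditioning on $S$ is taken to mean fixing a valuation of $S$ and viewing $\textnormal{rank}$ as a function of the remaining variables. The helper names and the finite integer domain are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Valuation = Dict[str, int]
Rank = Callable[[Valuation], float]


def is_t_decomposable(rank: Rank, variables: Sequence[str],
                      t_vars: Sequence[str], domain: Sequence[int]) -> bool:
    """Brute-force test of the assumed definition over a small finite domain."""
    rest = [v for v in variables if v not in t_vars]
    thetas = [dict(zip(t_vars, vals)) for vals in product(domain, repeat=len(t_vars))]
    phis = [dict(zip(rest, vals)) for vals in product(domain, repeat=len(rest))]

    def dominates(t1: Valuation, t2: Valuation) -> bool:
        # t1 ranks at least as high as t2 under every extension phi.
        return all(rank({**phi, **t1}) >= rank({**phi, **t2}) for phi in phis)

    # A suitable total order exists exactly when every pair is comparable.
    return all(dominates(t1, t2) or dominates(t2, t1)
               for t1 in thetas for t2 in thetas)


def is_t_decomposable_conditioned(rank: Rank, variables: Sequence[str],
                                  t_vars: Sequence[str], s_vars: Sequence[str],
                                  domain: Sequence[int]) -> bool:
    """Fix every valuation pi of S and test the restricted ranking function."""
    remaining = [v for v in variables if v not in s_vars]
    for vals in product(domain, repeat=len(s_vars)):
        pi = dict(zip(s_vars, vals))
        restricted = lambda v, pi=pi: rank({**pi, **v})
        if not is_t_decomposable(restricted, remaining, t_vars, domain):
            return False
    return True
```

For a vertex-based sum such as `lambda v: v["x"] + v["y"] + v["z"] + v["u"]` with $T = \{x, y\}$ and $S = \{u\}$, both checks succeed, which is consistent with the proposition. A product such as `lambda v: v["x"] * v["z"]` over a domain containing negative values fails the first check for $T = \{x\}$, since the sign of $z$ flips the order of the $x$ values.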

Figures (3)

  • Figure 1: Preprocessing and enumeration phase for Example 1.1. Each cell is assigned a memory address (written next to the cell). Pointers in cells are populated with the memory address of the cell they point to. Cells are color coded according to the bag (white for the root bag, blue for $\mathcal{B}_2$, orange for $\mathcal{B}_3$, and olive for $\mathcal{B}_4$).
  • Figure 2: Database instance $D$ for the 4-path query. Each edge is color coded by the relation it belongs to. Values over the edges denote the weight assigned to each tuple.
  • Figure 3: Query decomposition example with depth more than one.

Theorems & Definitions (46)

  • Example 1.1
  • Example 2.1
  • Example 2.2
  • Definition 2.3: Decomposable Ranking
  • Example 2.4
  • Definition 2.5
  • Proposition 2.6
  • proof
  • Definition 2.7: Compatible Ranking
  • Example 2.8
  • ...and 36 more