Diversity of Answers to Conjunctive Queries

Timo Camillo Merkl, Reinhard Pichler, Sebastian Skritek

TL;DR

The paper studies Diverse-$\mathcal{Q}$: selecting a size-$k$ diversity set of answers to a CQ (and extensions) such that a diversity measure $\delta$ (built from pairwise Hamming distances via a polynomial-time aggregator $f$) reaches at least $d$. It provides a detailed parameterized complexity landscape across query classes: for acyclic CQs (ACQ) with diverse-set size $k$, Diverse-$\mathcal{ACQ}$ is in XP for combined complexity and becomes FPT in data complexity; a W[1]-hardness result holds for ws-monotone measures. Extensions to unions of CQs (UCQ/UACQ) and CQs with negation (CQ$^\neg$) show a mix of tractability and hardness, with Diverse-wsm-UACQ being NP-hard even for $k=2$, while Diverse-sum-ACQ remains FPT in query complexity and efficient in data settings. The work also outlines algorithmic strategies based on Yannakakis-style join-tree DP, and discusses structural width measures (treewidth, hypertree width) and their impact on tractability, as well as future directions like smw-bounded queries, beta-acyclicity, and approximation approaches. Overall, the results map the computational boundaries of producing diverse CQ-answer subsets, informing both exact and heuristic diversification approaches in practice.
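
To make the problem statement concrete, the following is a minimal sketch of the decision problem over an already materialized answer set. It is an exhaustive search for illustration only, not the join-tree dynamic program the paper develops. The function names (`hamming`, `delta_sum`, `delta_min`, `diverse_brute_force`) and the toy answer tuples are assumptions introduced here; sum and min are used as two example aggregators over the pairwise Hamming distances.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two answer tuples differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def delta_sum(answers):
    """Example measure: sum of all pairwise Hamming distances."""
    return sum(hamming(a, b) for a, b in combinations(answers, 2))


def delta_min(answers):
    """Example measure: minimum pairwise Hamming distance (0 if no pair exists)."""
    return min((hamming(a, b) for a, b in combinations(answers, 2)), default=0)


def diverse_brute_force(answers, k, d, delta):
    """Return the first size-k subset with diversity >= d, or None.

    Hypothetical helper: it tries every k-subset of the answers, so it is
    only usable on tiny inputs where the answer set is fully materialized.
    """
    for subset in combinations(answers, k):
        if delta(subset) >= d:
            return subset
    return None


# Toy answer set of a query with three output variables (illustrative data).
answers = [(1, "a", "x"), (1, "a", "y"), (2, "b", "y"), (2, "a", "x")]

pair = diverse_brute_force(answers, k=2, d=3, delta=delta_sum)
triple = diverse_brute_force(answers, k=3, d=2, delta=delta_min)
none_found = diverse_brute_force(answers, k=3, d=7, delta=delta_sum)
```

The search enumerates every $k$-subset of the answers, and the answer set itself can be exponential in the query size, which is why the paper's analysis of more refined algorithms matters.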

Abstract

Enumeration problems aim at outputting, without repetition, the set of solutions to a given problem instance. However, outputting the entire solution set may be prohibitively expensive if it is too big. In this case, outputting a small, sufficiently diverse subset of the solutions would be preferable. This leads to the Diverse-version of the original enumeration problem, where the goal is to achieve a certain level d of diversity by selecting k solutions. In this paper, we look at the Diverse-version of the query answering problem for Conjunctive Queries and extensions thereof. That is, we study the problem of whether it is possible to achieve a certain level d of diversity by selecting k answers to the given query and, in the positive case, of actually computing such k answers.

Paper Structure

This paper contains 11 sections, 19 theorems, 49 equations, and 1 figure.

Key Result

Theorem 3.2

The Diverse-ACQ problem is in $\mathsf{XP}$ in combined complexity when parameterized by the size $k$ of the diversity set. More specifically, for an ACQ $Q(X)$, a database $I$, and integers $k$ and $d$, Algorithm 1 decides the Diverse-ACQ problem in time $\mathcal{O}(|R^I|^{2k} \cdot (|X| + 1)^{k(k-1)} \dots$

Figures (1)

  • Figure 1: Example Execution of the Basic Algorithm.

Theorems & Definitions (39)

  • Theorem 3.2
  • Lemma 3.3
  • proof
  • Lemma 3.4
  • proof
  • Lemma 3.5
  • proof
  • Lemma 3.6
  • proof
  • proof (of Theorem 3.2)
  • ...and 29 more