Approximately Counting Answers to Conjunctive Queries with Disequalities and Negations

Jacob Focke, Leslie Ann Goldberg, Marc Roth, Stanislav Živný

TL;DR

The paper studies the problem of approximately counting answers to extended conjunctive queries (ECQs), which allow disequalities and negations, in large databases. It gives a parameterised classification: an FPTRAS exists for ECQs of bounded treewidth and bounded arity, and for DCQs of unbounded arity when the adaptive width is bounded, with matching lower bounds under the rETH. No FPRAS exists unless $\mathrm{NP} = \mathrm{RP}$, even when the treewidth is $1$, although CQs (with neither disequalities nor negations) admit an FPRAS when the underlying hypergraphs have bounded fractional hypertreewidth. The methodology hinges on reducing counting to hyperedge counting via decision oracles, employing colour-coding to handle quantified variables, and leveraging tree automata for the bounded-$\mathrm{fhw}$ regime; these techniques also yield an FPTRAS for locally injective homomorphisms. The results have implications for sampling, for unions of queries, and for broader classes of width measures.

Abstract

We study the complexity of approximating the number of answers to a small query $\varphi$ in a large database $\mathcal{D}$. We establish an exhaustive classification into tractable and intractable cases if $\varphi$ is a conjunctive query with disequalities and negations:

  • If there is a constant bound on the arity of $\varphi$, and if the randomised Exponential Time Hypothesis (rETH) holds, then the problem has a fixed-parameter tractable approximation scheme (FPTRAS) if and only if the treewidth of $\varphi$ is bounded.
  • If the arity is unbounded and we allow disequalities only, then the problem has an FPTRAS if and only if the adaptive width of $\varphi$ (a width measure strictly more general than treewidth) is bounded; the lower bound relies on the rETH as well.

Additionally we show that our results cannot be strengthened to achieve a fully polynomial randomised approximation scheme (FPRAS): We observe that, unless $\mathrm{NP} =\mathrm{RP}$, there is no FPRAS even if the treewidth (and the adaptive width) is $1$. However, if there are neither disequalities nor negations, we prove the existence of an FPRAS for queries of bounded fractional hypertreewidth, strictly generalising the recently established FPRAS for conjunctive queries with bounded hypertreewidth due to Arenas, Croquevielle, Jayaram and Riveros (STOC 2021).
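To make the counting problem concrete, the following is a minimal brute-force sketch, not the paper's algorithm. It counts the answers to a small query exactly by trying every assignment, where an answer is an assignment to the free variables that extends to the quantified variables so that all atoms, negated atoms and disequalities hold. The function name `count_answers`, the encoding of atoms as `(relation, variables)` pairs, and the example database are all assumptions made for illustration.

```python
from itertools import product


def count_answers(universe, relations, free_vars, quantified_vars,
                  atoms, negated_atoms, disequalities):
    """Count assignments to free_vars that extend to a satisfying assignment.

    Hypothetical encoding: atoms and negated_atoms are lists of
    (relation_name, tuple_of_variables); disequalities is a list of
    variable pairs (u, v) meaning u != v.
    """
    def satisfies(assign):
        # Every positive atom must be a tuple of its relation.
        for rel, args in atoms:
            if tuple(assign[v] for v in args) not in relations[rel]:
                return False
        # Every negated atom must NOT be a tuple of its relation.
        for rel, args in negated_atoms:
            if tuple(assign[v] for v in args) in relations[rel]:
                return False
        # Every disequality must hold.
        return all(assign[u] != assign[v] for u, v in disequalities)

    count = 0
    for free_vals in product(universe, repeat=len(free_vars)):
        base = dict(zip(free_vars, free_vals))
        # An answer is counted once, however many witnesses it has.
        for q_vals in product(universe, repeat=len(quantified_vars)):
            assign = {**base, **dict(zip(quantified_vars, q_vals))}
            if satisfies(assign):
                count += 1
                break
    return count


# Illustrative database: a directed graph with edge relation E.
universe = [1, 2, 3, 4]
relations = {"E": {(1, 2), (2, 3), (3, 4), (1, 3)}}

# phi(x, y) = exists z: E(x, z) and E(z, y) and x != y and not E(x, y)
n_answers = count_answers(
    universe, relations,
    free_vars=["x", "y"], quantified_vars=["z"],
    atoms=[("E", ("x", "z")), ("E", ("z", "y"))],
    negated_atoms=[("E", ("x", "y"))],
    disequalities=[("x", "y")],
)
print(n_answers)
```

This enumeration takes time roughly $|U|^{k}$ for a universe $U$ and a query with $k$ variables, which is infeasible for large databases. An FPTRAS instead outputs, with probability at least $1-\delta$, a value within a factor $1 \pm \varepsilon$ of the true count, in time $f(\|\varphi\|) \cdot \mathrm{poly}(\|\mathcal{D}\|, 1/\varepsilon, \log(1/\delta))$ for some computable function $f$.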

Paper Structure

This paper contains 26 sections, 18 theorems, 22 equations, 1 figure.

Key Result

Theorem 5

Let $t$ and $a$ be positive integers. Let $C$ be a class of hypergraphs such that every member of $C$ has treewidth at most $t$ and arity at most $a$. Then $\mathrm{\#ECQ}\left(\Phi_C\right)$ has an FPTRAS, running in time $\exp(\mathrm{O}(||\varphi||^2)) \cdot \mathrm{poly}(\log(1/\delta),\varepsil

Figures (1)

  • Figure 1: Overview of our results on approximately counting answers to conjunctive queries ($\mathrm{CQ}$s), to conjunctive queries with disequalities ($\mathrm{DCQ}$s), and to conjunctive queries with disequalities and negations ($\mathrm{ECQ}$s). Upper and lower bounds depend on a variety of width measures of the input queries, namely, treewidth (tw), hypertreewidth (hw), fractional hypertreewidth (fhw), and adaptive width (aw). The equivalence of the width parameters in the case of bounded arity is well known; we provide an explicit argument in Observation \ref{obs:tw_vs_aw}. For completeness, we also compare our results to recent work of Arenas et al. \cite{ArenasNew}. The lower bounds either rely on the assumption that $\mathrm{NP} \neq \mathrm{RP}$ or on the rETH. All referenced theorems and observations are stated in Section \ref{sec:results}. Note that, while our results complete the picture of the complexity of $\mathrm{CQ}$, $\mathrm{DCQ}$, and $\mathrm{ECQ}$ in the bounded arity case, two questions remain open for the unbounded arity case: Assuming the adaptive width is bounded, does $\mathrm{ECQ}$ have an FPTRAS, and does $\mathrm{CQ}$ have an FPRAS?

Theorems & Definitions (52)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 5
  • Corollary 6
  • Conjecture 7: rETH, ImpagliazzoP01
  • Theorem 8: Marx10:ToC
  • proof
  • Definition 11: weakly dominated, strongly dominated, weakly equivalent
  • ...and 42 more