Approximately Counting Answers to Conjunctive Queries with Disequalities and Negations
Jacob Focke, Leslie Ann Goldberg, Marc Roth, Stanislav Živný
TL;DR
The paper studies the problem of approximately counting the answers to extended conjunctive queries (ECQs), that is, conjunctive queries with disequalities and negations, in large databases. It gives a complete parameterised classification, with lower bounds conditional on the randomised Exponential Time Hypothesis (rETH): for bounded arity, ECQs admit an FPTRAS if and only if their treewidth is bounded; for unbounded arity, conjunctive queries with disequalities only (DCQs) admit an FPTRAS if and only if their adaptive width is bounded. It also shows that, unless NP = RP, no FPRAS exists even when the treewidth (and the adaptive width) is $1$, while giving an FPRAS for plain conjunctive queries (CQs) whose underlying hypergraphs have bounded fractional hypertreewidth. The methodology hinges on reducing counting to hyperedge counting via decision oracles, employing colour-coding to handle quantified variables, and leveraging tree automata for the bounded fractional hypertreewidth regime; these techniques also yield an FPTRAS for locally injective homomorphisms. The results extend both the theory and the toolbox for approximate query counting, with implications for sampling, for unions of queries, and for broader classes of width measures.
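For orientation, the two approximation guarantees named above can be written out as follows. This is one common formulation, not a quotation from the paper: the symbols $N$, $\hat{N}$, $\varepsilon$, $\delta$, the function $f$ and the size notation $\|\cdot\|$ are notational assumptions. Both schemes output an estimate $\hat{N}$ of the number of answers $N$ such that

$$\Pr\bigl[(1-\varepsilon)\,N \le \hat{N} \le (1+\varepsilon)\,N\bigr] \ge 1-\delta .$$

They differ only in the permitted running time, where $f$ is an arbitrary computable function of the query size:

$$\text{FPTRAS: } f(\|\varphi\|)\cdot\mathrm{poly}\bigl(\|\mathcal{D}\|,\,1/\varepsilon,\,\log(1/\delta)\bigr), \qquad \text{FPRAS: } \mathrm{poly}\bigl(\|\varphi\|,\,\|\mathcal{D}\|,\,1/\varepsilon,\,\log(1/\delta)\bigr).$$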
Abstract
We study the complexity of approximating the number of answers to a small query $\varphi$ in a large database $\mathcal{D}$. We establish an exhaustive classification into tractable and intractable cases if $\varphi$ is a conjunctive query with disequalities and negations:
$\bullet$ If there is a constant bound on the arity of $\varphi$, and if the randomised Exponential Time Hypothesis (rETH) holds, then the problem has a fixed-parameter tractable approximation scheme (FPTRAS) if and only if the treewidth of $\varphi$ is bounded.
$\bullet$ If the arity is unbounded and we allow disequalities only, then the problem has an FPTRAS if and only if the adaptive width of $\varphi$ (a width measure strictly more general than treewidth) is bounded; the lower bound relies on the rETH as well.
Additionally we show that our results cannot be strengthened to achieve a fully polynomial randomised approximation scheme (FPRAS): We observe that, unless $\mathrm{NP} =\mathrm{RP}$, there is no FPRAS even if the treewidth (and the adaptive width) is $1$. However, if there are neither disequalities nor negations, we prove the existence of an FPRAS for queries of bounded fractional hypertreewidth, strictly generalising the recently established FPRAS for conjunctive queries with bounded hypertreewidth due to Arenas, Croquevielle, Jayaram and Riveros (STOC 2021).
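To make the counted quantity concrete, the following minimal Python sketch counts the answers to a conjunctive query with disequalities and negations by brute force. It is not the paper's algorithm: it is exact rather than approximate, and its running time is exponential in the number of variables. The function name `count_answers`, its parameters and the example query are all hypothetical and chosen only for illustration.

```python
from itertools import product


def count_answers(universe, relations, free_vars, quantified_vars,
                  atoms, negated_atoms=(), disequalities=()):
    """Count assignments to free_vars that extend to an assignment of all
    variables satisfying every atom, no negated atom, and every disequality.

    Hypothetical helper for illustration; brute force over all assignments.
    """
    all_vars = list(free_vars) + list(quantified_vars)
    answers = set()
    for values in product(universe, repeat=len(all_vars)):
        a = dict(zip(all_vars, values))
        # Positive atoms R(x1, ..., xk): the tuple must be in the relation.
        if not all(tuple(a[v] for v in vs) in relations[r] for r, vs in atoms):
            continue
        # Negated atoms: the tuple must NOT be in the relation.
        if any(tuple(a[v] for v in vs) in relations[r] for r, vs in negated_atoms):
            continue
        # Disequalities u != v.
        if any(a[u] == a[v] for u, v in disequalities):
            continue
        # Project onto the free variables; the set removes duplicate witnesses.
        answers.add(tuple(a[v] for v in free_vars))
    return len(answers)


# Example database: a directed graph with edge relation E.
universe = [1, 2, 3, 4]
relations = {"E": {(1, 2), (2, 3), (3, 4), (1, 3)}}

# phi(x, y) = exists z: E(x, z) and E(z, y) and x != y and not E(x, y)
with_negation = count_answers(
    universe, relations,
    free_vars=["x", "y"], quantified_vars=["z"],
    atoms=[("E", ("x", "z")), ("E", ("z", "y"))],
    negated_atoms=[("E", ("x", "y"))],
    disequalities=[("x", "y")],
)

# Same query without the negated atom.
without_negation = count_answers(
    universe, relations,
    free_vars=["x", "y"], quantified_vars=["z"],
    atoms=[("E", ("x", "z")), ("E", ("z", "y"))],
    disequalities=[("x", "y")],
)

print(with_negation, without_negation)
```

In this example the pairs connected by a two-step path are (1, 3), (2, 4) and (1, 4); the negated atom removes (1, 3) because that edge is present. An answer is an assignment to the free variables only, so several witnesses for the quantified variable $z$ would still count once.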
