Bag Semantics Conjunctive Query Containment. Four Small Steps Towards Undecidability

Jerzy Marcinkowski, Mateusz Orda

TL;DR

The paper studies the bag-semantics version of the Query Containment Problem for conjunctive queries (CQs) and proves a sequence of undecidability results for generalizations of it, even for boolean queries with limited use of inequalities. It builds a polynomial-encoding framework that reduces Hilbert's 10th problem to the question of whether a scaled count for a small query is bounded by the count for a big query on all databases, using an Arena gadget together with auxiliary constructions (CYCLIQ, pi_s, pi_b). The core contributions are undecidability with fixed multipliers and with a single inequality; the paper relates these results to other undecidability results for bag-semantics QCP and shows how to translate between problem variants via blow-ups and products. Overall, the work delineates limits on decidability for generalizations of bag-semantics query containment and clarifies the role of polynomial encodings in such reductions, with potential implications for database optimization and theory.

Abstract

The Query Containment Problem (QCP) is one of the most fundamental decision problems in database query processing and optimization. The complexity of QCP for conjunctive queries (QCP-CQ) has been fully understood since the 1970s. But, as Chaudhuri and Vardi noticed in their classical 1993 paper [1], this understanding is based on the assumption that query answers are sets of tuples, and it does not transfer to the situation when multi-set (bag) semantics is considered. Now, 30 years after [1] was written, decidability of QCP-CQ for bag semantics remains an open question, one of the most intriguing open questions in database theory. In this paper we show a series of undecidability results for some generalizations of bag-semantics QCP-CQ. We show, for example, that the following problem is undecidable: given two boolean conjunctive queries $Q$ and $Q'$ and a linear function $F$, does the inequality $F(Q(D)) \leq Q'(D)$ hold for each database instance $D$?

Paper Structure

This paper contains 29 sections, 29 theorems, 63 equations.

Key Result

Theorem 1

The following problem is undecidable: given boolean conjunctive queries (without inequality) $\phi_s$ and $\phi_b$ and a natural number $\mathbbmss{c}$, does $\mathbbmss{c}\phi_s(D) \leq \phi_b(D)$ hold for each non-trivial database $D$?
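To make the inequality in Theorem 1 concrete, here is a minimal Python sketch that evaluates it on one fixed database. It assumes the usual bag-semantics reading, in which the value of a boolean conjunctive query on $D$ is the number of variable assignments (homomorphisms) that satisfy all of its atoms. The encoding of queries as lists of atoms, and the names `count_homomorphisms` and `holds_on`, are illustrative assumptions and do not come from the paper. The sketch also does not check the paper's non-triviality condition on $D$.

```python
from itertools import product

def count_homomorphisms(query, database):
    """Count assignments of the query's variables to domain elements
    that map every atom of the query onto a tuple of the database.

    query:    list of atoms (relation_name, tuple_of_variable_names)
    database: dict mapping relation_name -> set of tuples
    """
    variables = sorted({v for _, args in query for v in args})
    domain = sorted({x for tuples in database.values() for t in tuples for x in t})
    count = 0
    # Brute force: try every assignment of domain elements to variables.
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(
            tuple(assignment[v] for v in args) in database.get(rel, set())
            for rel, args in query
        ):
            count += 1
    return count

def holds_on(c, phi_s, phi_b, database):
    """Check c * phi_s(D) <= phi_b(D) for a single database D."""
    return c * count_homomorphisms(phi_s, database) <= count_homomorphisms(phi_b, database)

# Hypothetical example: phi_s counts edges, phi_b counts ordered pairs of edges.
phi_s = [("E", ("x", "y"))]
phi_b = [("E", ("x", "y")), ("E", ("u", "v"))]
D = {"E": {(1, 2), (2, 3), (3, 1)}}  # a directed triangle

print(count_homomorphisms(phi_s, D))  # 3
print(count_homomorphisms(phi_b, D))  # 9
print(holds_on(2, phi_s, phi_b, D))   # True:  2 * 3 <= 9
print(holds_on(4, phi_s, phi_b, D))   # False: 4 * 3 >  9
```

Checking a single database is always possible by brute force, as above. The theorem concerns the universal question over all non-trivial databases, and states that no algorithm can decide it.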

Theorems & Definitions (35)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Lemma 1
  • Definition 2
  • Definition 3
  • Lemma 4
  • Lemma 5
  • ...and 25 more