Partition Constraints for Conjunctive Queries: Bounds and Worst-Case Optimal Joins
Kyle Deeds, Timo Camillo Merkl
TL;DR
This work introduces Partition Constraints (PCs), a declarative generalization of Degree Constraints (DCs): a PC asserts that a relation can be partitioned into sub-relations, each satisfying a much tighter degree limit, which refines join-size bounds and improves worst-case optimal join (WCOJ) algorithms. The paper gives formal definitions, provides a linear-time approximate algorithm and a quadratic-time exact algorithm for computing partitions that witness PCs, and shows how existing DC-based bounds and WCOJ techniques lift to the PC framework. The Hexagon Query illustrates that PC-based bounds can be asymptotically tighter than DC-based bounds and that PCs enable linear-time enumeration where variable-at-a-time (VAAT) approaches have suboptimal runtimes, while WCOJ guarantees are retained when PCs are combined with DC-derived bounds. The paper also extends these ideas to general conjunctive queries, offering decomposition methods and showing that PC-aware algorithms inherit tight worst-case behavior from the underlying DC theory, with practical paths for integration into query engines and benchmarks for further refinement.
Abstract
In the last decade, various works have used statistics on relations to improve both the theory and practice of conjunctive query execution. Starting with the AGM bound, which took advantage of relation sizes, later works incorporated statistics such as functional dependencies and degree constraints. Each new statistic prompted work along two lines: bounding the size of conjunctive query outputs and designing worst-case optimal join algorithms. In this work, we continue in this vein by introducing a new statistic called a partition constraint. This statistic captures latent structure within relations by partitioning them into sub-relations, each of which has a much tighter degree constraint. We show that this approach can both refine existing cardinality bounds and improve existing worst-case optimal join algorithms.
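To make the idea of a partition constraint concrete, the following minimal sketch builds a binary relation R(A, B) that is the union of two "stars", so its maximum degree is large in both directions, and then splits it into two sub-relations that each have a small degree in one direction. The helper names `max_degree` and `peel_partition` are hypothetical, and the greedy peeling used here is a simple degeneracy-style illustration chosen for brevity; it is not taken from the paper and is neither the linear-time approximate nor the quadratic-time exact algorithm mentioned above.

```python
from collections import defaultdict


def max_degree(tuples, key_index):
    """Largest number of distinct tuples sharing one value at position key_index."""
    counts = defaultdict(int)
    for t in set(tuples):
        counts[t[key_index]] += 1
    return max(counts.values(), default=0)


def peel_partition(tuples):
    """Greedily split a binary relation R(A, B) into (part_a, part_b).

    Illustrative assumption, not the paper's algorithm: repeatedly pick the
    A-value or B-value with the fewest remaining tuples and assign those
    tuples to the part named after its side. This loop is quadratic in the
    worst case; it favors clarity over speed.
    """
    remaining = set(tuples)
    part_a, part_b = set(), set()
    while remaining:
        # Degree of every A-value and B-value among the remaining tuples.
        deg = defaultdict(int)
        for a, b in remaining:
            deg[("A", a)] += 1
            deg[("B", b)] += 1
        side, value = min(deg, key=deg.get)
        if side == "A":
            chosen = {t for t in remaining if t[0] == value}
            part_a |= chosen
        else:
            chosen = {t for t in remaining if t[1] == value}
            part_b |= chosen
        remaining -= chosen
    return part_a, part_b


# Two stars: A-value 0 pairs with many B-values, B-value 0 with many A-values.
n = 100
R = {(0, b) for b in range(1, n)} | {(a, 0) for a in range(1, n)}

part_a, part_b = peel_partition(R)

# Degree constraints on R alone are weak in both directions ...
print("R:", max_degree(R, 0), max_degree(R, 1))
# ... but each part has a small degree on the side it was assigned to.
print("part_a, degree per A-value:", max_degree(part_a, 0))
print("part_b, degree per B-value:", max_degree(part_b, 1))
```

On this input, a degree constraint on R in either direction can only be as small as n - 1, whereas each part returned by the sketch has degree at most 1 on its own side. This is the kind of latent structure a partition constraint is meant to capture.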
