DPconv: Super-Polynomially Faster Join Ordering
Mihail Stoian, Andreas Kipf
TL;DR
DPconv reframes join ordering as a dynamic program over subset convolutions, breaking the longstanding $O(3^n)$ barrier via fast subset convolution and layer-wise optimizations. It instantiates the framework for two classic cost functions, $C_{\mathrm{out}}$ and $C_{\max}$: an $O(2^n n^2 \cdot Wn \log(Wn))$-time algorithm for $C_{\mathrm{out}}$ (within polynomial factors of $2^n$ when $W$ is polynomial in $n$) and an $O(2^n n^3)$-time algorithm for $C_{\max}$, together with a simple practical algorithm. The work also introduces a two-phase objective, $C_{\mathrm{cap}}$, that jointly optimizes time and space, and reports substantial speedups on clique-like queries, up to 30x over DPccp for large $n$. In addition, it presents an approximation approach with sub-exponential-in-$n$ guarantees and discusses extensions to other cost functions and hypergraphs, with implications for resource-aware query optimization in memory-constrained environments.
Abstract
We revisit the join ordering problem in query optimization. The standard exact algorithm, DPccp, has a worst-case running time of $O(3^n)$. This is prohibitively expensive for large queries, which are not that uncommon anymore. We develop a new algorithmic framework based on subset convolution. DPconv achieves a super-polynomial speedup over DPccp, breaking the $O(3^n)$ time-barrier for the first time. We show that the instantiation of our framework for the $C_\max$ cost function is up to 30x faster than DPccp for large clique queries.
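
To make the $O(3^n)$ baseline concrete, here is a minimal sketch of the textbook submask dynamic program for join ordering. It is neither DPccp nor DPconv: it only illustrates the recurrence that a subset-convolution formulation would accelerate. The names `optimal_cost`, `c_out`, `c_max`, and `example_card` are hypothetical, and the sketch assumes that cross products are allowed (as in a clique query), that base relations cost 0, and that the cardinality of every subset of relations is given as a lookup table.

```python
from typing import Callable, List

INF = float("inf")

def optimal_cost(n: int, card: List[float],
                 combine: Callable[[float, float, float], float]) -> float:
    """Best plan cost over all bushy join trees of n relations.

    card[mask] is the (assumed known) cardinality of joining the relations
    in the bitmask `mask`; combine(left, right, out) is the cost of a join
    whose inputs cost `left` and `right` and whose output has size `out`.
    """
    dp = [INF] * (1 << n)
    for i in range(n):
        dp[1 << i] = 0  # assumption: scanning a base relation is free

    for s in range(1, 1 << n):
        if s & (s - 1) == 0:
            continue  # single relation, already initialized
        low = s & -s  # lowest set bit, used to skip mirrored splits
        t = (s - 1) & s  # largest proper submask of s
        while t:
            if t & low:  # consider each unordered split {t, s \ t} once
                dp[s] = min(dp[s], combine(dp[t], dp[s ^ t], card[s]))
            t = (t - 1) & s  # next smaller submask of s
    return dp[(1 << n) - 1]

# C_out sums all intermediate result sizes; C_max takes the largest one.
def c_out(left: float, right: float, out: float) -> float:
    return left + right + out

def c_max(left: float, right: float, out: float) -> float:
    return max(left, right, out)

# Made-up cardinalities for three relations A (bit 0), B (bit 1), C (bit 2).
example_card = [0, 10, 20, 5, 30, 100, 50, 40]

best_out = optimal_cost(3, example_card, c_out)
best_max = optimal_cost(3, example_card, c_max)
```

The inner loop visits every submask of every set, which sums to roughly $3^n$ steps. In this view, swapping `c_out` for `c_max` changes only how the two sub-plan costs are combined, which is why a single framework can be instantiated for both cost functions.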
