DPconv: Super-Polynomially Faster Join Ordering

Mihail Stoian; Andreas Kipf

TL;DR

DPconv reframes join ordering as a subset-convolution dynamic program, breaking the longstanding $O(3^n)$ barrier via fast subset convolution and layer-wise optimizations. It instantiates the framework for two classic cost functions, $C_{\mathrm{out}}$ and $C_{\max}$: an $O(2^n n^2 \cdot Wn \log(Wn))$-time algorithm for $C_{\mathrm{out}}$ (close to $2^n$, up to polynomial factors, when $W$ is polynomial in $n$) and an $O(2^n n^3)$-time algorithm for $C_{\max}$, together with a simpler practical variant. The work also introduces a two-phase objective, $C_{\mathrm{cap}}$, which optimizes $C_{\mathrm{out}}$ subject to the largest intermediate size matching the $C_{\max}$ optimum, thereby addressing time and space jointly. Experiments show substantial speedups on clique-like queries, up to 30x over DPccp for large $n$. In addition, the paper presents a $(1+\varepsilon)$-approximation for min-sum subset convolution running in $\widetilde{O}(2^{3n/2} / \sqrt{\varepsilon})$ time, and discusses extensions to other cost functions and hypergraphs, with implications for resource-aware query optimization in memory-constrained environments.

Abstract

We revisit the join ordering problem in query optimization. The standard exact algorithm, DPccp, has a worst-case running time of $O(3^n)$. This is prohibitively expensive for large queries, which are not that uncommon anymore. We develop a new algorithmic framework based on subset convolution. DPconv achieves a super-polynomial speedup over DPccp, breaking the $O(3^n)$ time-barrier for the first time. We show that the instantiation of our framework for the $C_\max$ cost function is up to 30x faster than DPccp for large clique queries.
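To make the $O(3^n)$ baseline concrete, the following is a minimal sketch of a subset-enumeration dynamic program (in the style of DPsub, not DPccp itself) for a $C_{\mathrm{out}}$-like cost. It rests on several assumptions that are not taken from the paper: cross products are allowed (as in a clique query), cardinalities are plain products of base-table sizes with no selectivities, and the cost of a plan is the sum of all join result sizes including the final one. The names `product_cardinalities` and `dpsub_cout` are hypothetical.

```python
def product_cardinalities(sizes):
    """card[S] = product of the base-table sizes in bitmask S (assumed model)."""
    n = len(sizes)
    card = [1] * (1 << n)
    for S in range(1, 1 << n):
        low = S & -S                      # lowest set bit of S
        i = low.bit_length() - 1          # index of that relation
        card[S] = card[S ^ low] * sizes[i]
    return card


def dpsub_cout(card):
    """Return (optimal cost, number of (S, T) splits examined).

    dp[S] = card[S] + min over non-empty proper subsets T of S
            of dp[T] + dp[S \\ T], with dp[{i}] = 0.
    """
    size = len(card)
    n = size.bit_length() - 1
    assert size == 1 << n, "card must have length 2^n"
    INF = float("inf")
    dp = [INF] * size
    for i in range(n):
        dp[1 << i] = 0                    # base relations cost nothing
    pairs = 0
    for S in range(1, size):
        if S & (S - 1) == 0:
            continue                      # singleton, already initialised
        best = INF
        T = (S - 1) & S                   # largest proper subset of S
        while T:                          # enumerate all non-empty proper subsets
            pairs += 1
            best = min(best, dp[T] + dp[S ^ T])
            T = (T - 1) & S
        dp[S] = card[S] + best
    return dp[size - 1], pairs


cost, pairs = dpsub_cout(product_cardinalities([10, 20, 30]))
print(cost, pairs)
```

The inner loop visits every split of every subset, which amounts to $3^n - 2^{n+1} + 1$ splits in total. This is the naive evaluation of a min-sum subset convolution that the framework aims to speed up.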

Paper Structure (42 sections, 3 theorems, 17 equations, 9 figures, 3 algorithms)

Introduction
Background
Query Graph
Cost Function
Join Ordering and Dynamic Programming
Subset Convolution
Rings & Semi-Rings
Our Framework
Join Ordering Meets Subset Convolution
Embedding Technique
Instantiating $C_{\mathrm{out}}$
Instantiating $C_{\max}$
Beyond $C_{\mathrm{out}}$ and $C_{\max}$
Fast Subset Convolution
Zeta Transform
...and 27 more sections

Key Result

theorem 1

$(1+\varepsilon)$-Approximate min-sum subset convolution can be solved in $\widetilde{O}(2^{3n/2} / \sqrt{\varepsilon})$ time.
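
For orientation, the statement can be unpacked as follows. The definitions below follow the usual conventions for min-sum subset convolution and multiplicative approximation; they are assumed here rather than quoted from the paper, and $\tilde{h}$ is a symbol introduced only for this illustration. For set functions $f, g$ over subsets of an $n$-element ground set $[n]$:

$$
h(S) = (f \ast g)(S) = \min_{T \subseteq S} \bigl( f(T) + g(S \setminus T) \bigr),
\qquad
h(S) \le \tilde{h}(S) \le (1+\varepsilon)\, h(S) \quad \text{for all } S \subseteq [n].
$$

Since $2^{3n/2} = (2\sqrt{2})^n \approx 2.83^n$, the bound grows more slowly than the $3^n$ of naive evaluation for any fixed $\varepsilon$, ignoring the lower-order factors hidden by $\widetilde{O}$.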

Figures (9)

Figure 1: Join ordering dynamic programming algorithms such as DPsub implicitly use subset convolution, but compute it naively. DPconv instead uses a highly tuned implementation of fast subset convolution.
Figure 2: Visualizing the fast subset convolution (FSC), outlined in Lst. \ref{lst:fsc_impl}: ① We rank the set functions $f$ and $g$ and ② apply the zeta transform to obtain $\zeta f$ and $\zeta g$, respectively. ③ We perform the ranked convolution between $\zeta f$ and $\zeta g$. ④ We apply the Möbius transform to obtain the ranked $h$. ⑤ Finally, we reconstitute $h = f \ast g$, the actual subset convolution. We highlight in color the steps needed to compute the second rank "slice" of $\zeta h$, namely $(\zeta h)(:, 2)$, during ranked convolution (as in Sec. \ref{subsec:ranked_convolution}). Intuitively, we need to sum up the dot products between the corresponding slices, i.e., $(\zeta f)(:, 0)$ with $(\zeta g)(:, 2)$, $(\zeta f)(:, 1)$ with $(\zeta g)(:, 1)$, and $(\zeta f)(:, 2)$ with $(\zeta g)(:, 0)$.
Figure 3: Visualizing Alg. \ref{algo:dpconv_max_simpler}.
Figure 4: Theoretical number of operations of the exact $O(3^n)$-time algorithm and the $\widetilde{O}(2^{3n/2} / \sqrt{\varepsilon})$-time $(1+\varepsilon)$-approximation algorithm for $n = 40$ and varying $\varepsilon$'s.
Figure 5: Overhead in optimization time for $C_{\mathrm{cap}}$ on JOB and CEB, i.e., optimizing $C_{\mathrm{out}}$ under the constraint that the largest intermediate size is the same as when optimizing with $C_{\max}$ (two optimization phases).
...and 4 more figures
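
The five steps named in the Figure 2 caption can be sketched in plain Python as follows. This is an illustrative, unoptimized version in the ordinary sum-product ring over integers, not the paper's tuned implementation; how the cost functions are mapped onto such a ring is not shown here. The function names and the `[subset][rank]` table layout are assumptions made for this sketch.

```python
def naive_subset_convolution(f, g):
    """Reference: h[S] = sum over T subseteq S of f[T] * g[S \\ T]."""
    size = len(f)
    h = [0] * size
    for S in range(size):
        T = S
        while True:                       # enumerate all subsets T of S
            h[S] += f[T] * g[S ^ T]
            if T == 0:
                break
            T = (T - 1) & S
    return h


def fast_subset_convolution(f, g):
    """Ranked zeta/Moebius subset convolution, O(2^n n^2) ring operations."""
    size = len(f)
    n = size.bit_length() - 1
    assert size == 1 << n and len(g) == size
    popcount = [bin(S).count("1") for S in range(size)]

    # Step 1: rank f and g, i.e. place f[S] in column |S| of a 2^n x (n+1) table.
    fr = [[0] * (n + 1) for _ in range(size)]
    gr = [[0] * (n + 1) for _ in range(size)]
    for S in range(size):
        fr[S][popcount[S]] = f[S]
        gr[S][popcount[S]] = g[S]

    # Step 2: zeta transform (sum over subsets), applied to every rank column.
    for table in (fr, gr):
        for i in range(n):
            bit = 1 << i
            for S in range(size):
                if S & bit:
                    lower, upper = table[S ^ bit], table[S]
                    for r in range(n + 1):
                        upper[r] += lower[r]

    # Step 3: ranked convolution, a polynomial product in the rank index per subset.
    hr = [[0] * (n + 1) for _ in range(size)]
    for S in range(size):
        for r in range(n + 1):
            hr[S][r] = sum(fr[S][j] * gr[S][r - j] for j in range(r + 1))

    # Step 4: Moebius transform (inverse of zeta), again per rank column.
    for i in range(n):
        bit = 1 << i
        for S in range(size):
            if S & bit:
                lower, upper = hr[S ^ bit], hr[S]
                for r in range(n + 1):
                    upper[r] -= lower[r]

    # Step 5: reconstitute h by reading the column whose rank equals |S|.
    return [hr[S][popcount[S]] for S in range(size)]


f = [1, 2, 3, 4, 5, 6, 7, 8]
g = [8, 7, 6, 5, 4, 3, 2, 1]
print(fast_subset_convolution(f, g) == naive_subset_convolution(f, g))
```

The rank columns are what keep the result a true subset convolution: multiplying zeta-transformed values without them would also count pairs of overlapping sets, whereas reading only column $|S|$ in step 5 retains exactly the disjoint pairs.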

Theorems & Definitions (3)

theorem 1: stoian_approx
theorem 2
corollary 1
