Succinct Structure Representations for Efficient Query Optimization

Zhekai Jiang, Qichen Wang, Christoph Koch

Abstract

Structural decomposition methods offer powerful theoretical guarantees for join evaluation, yet they are rarely used in real-world query optimizers. A major reason is the difficulty of combining cost-based plan search and structure-based evaluation. In this work, we bridge this gap by introducing meta-decompositions for acyclic queries, a novel representation that succinctly represents all possible join trees and enables their efficient enumeration. Meta-decompositions can be constructed in polynomial time and have sizes linear in the query size. We design an efficient polynomial-time cost-based optimizer based directly on the meta-decomposition, without the need to explicitly enumerate all possible join trees. We characterize plans found by this approach using a novel notion of width, which implies theoretical worst-case asymptotic bounds on the intermediate result sizes and running time of any query plan. Experimental results demonstrate that, in practice, the plans in our class are consistently comparable to, and in many cases even better than, the optimal ones found by the state-of-the-art dynamic programming approach, especially on large and complex queries, while our planning process runs orders of magnitude faster, taking time comparable to that of common heuristic methods.

Paper Structure (77 sections, 22 theorems, 12 equations, 23 figures, 5 tables, 8 algorithms)

Key Result

theorem 1

Consider star queries over $n$ relations with attribute sets $\bar{x}_1, \ldots, \bar{x}_n$, where for any $i, j \in [n]$, $\bar{x}_i \cap \bar{x}_j = \lbrace x\rbrace$, for some attribute $x$. There are $n^{n-1}$ possible join trees for such a star query with $n$ relations. (In some literature, e.g., birler_optimizing_2025, such queries are called "clique" queries, because their query graphs are cliques.)
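
The count in this theorem can be checked by brute force for small $n$. The sketch below rests on assumptions that this excerpt does not state: each relation is modeled with the shared attribute `x` plus one private attribute `y{i}`, join trees are taken to be rooted, and a join tree is any spanning tree over the relations in which the relations containing each attribute form a connected subtree. Under these assumptions the count is Cayley's $n^{n-2}$ labeled trees times $n$ choices of root. All function names are hypothetical.

```python
from itertools import combinations


def connected(nodes, edges):
    """Return True if `nodes` induce a connected subgraph of `edges`."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {v: [] for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].append(v)
            adj[v].append(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def is_join_tree(n, edges, schemas):
    """n-1 edges that connect all n relations form a tree; it is a join
    tree if, for each attribute, the relations containing that attribute
    are connected (the running intersection property)."""
    if len(edges) != n - 1 or not connected(range(n), edges):
        return False
    attributes = set().union(*schemas)
    return all(
        connected([i for i in range(n) if a in schemas[i]], edges)
        for a in attributes
    )


def count_rooted_join_trees(n):
    # Assumed schema: relation i has the shared attribute "x" and a
    # private attribute "y{i}", so any two relations intersect in {"x"}.
    schemas = [{"x", f"y{i}"} for i in range(n)]
    all_edges = list(combinations(range(n), 2))
    unrooted = sum(
        1
        for edges in combinations(all_edges, n - 1)
        if is_join_tree(n, edges, schemas)
    )
    # Assumption: join trees are rooted, so each tree counts once per root.
    return unrooted * n


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_rooted_join_trees(n), n ** (n - 1))
```

The enumeration is exponential and only meant for tiny $n$; it illustrates why explicitly listing join trees does not scale, which is the motivation given in the abstract for a succinct representation.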

Figures (23)

  • Figure 1: The hypergraph, a join tree, the query graph, and two query plans of the query $Q_{\ref{ex:123-12-13-23-intro}}$ in \ref{ex:123-12-13-23-intro}
  • Figure 2: Hypergraph and two valid join trees of the star query $Q_{\ref{ex:star}}$ in \ref{ex:star}
  • Figure 3: The hypergraph, a join tree, a width-1 query plan, and a width-2 query plan of the query $Q_{\ref{ex:hierarchical}}$
  • Figure 4: Meta-decomposition of the query $Q_{\ref{ex:star}}$
  • Figure 5: Meta-decomposition of the query $Q_{\ref{ex:hierarchical}}$
  • ...and 18 more figures

Theorems & Definitions (34)

  • theorem 1
  • theorem 2
  • definition 1
  • theorem 3
  • theorem 4
  • theorem 5
  • theorem 6
  • definition 2
  • proposition 1
  • theorem 7
  • ...and 24 more