Table of Contents
Fetching ...

PLANSIEVE: Real-time Suboptimal Query Plan Detection Through Incremental Refinements

Asoke Datta, Yesdaulet Izenov, Brian Tsan, Abylay Amanbayev, Florin Rusu

TL;DR

PLANSIEVE tackles the challenge of sub-optimal query plans caused by cardinality-estimation errors by predicting plan sub-optimality during query optimization rather than after execution. It combines a transformer-based Classification Model with a Cardinality Collector that uses surrogate cardinalities from a Third-Party Estimator and progressively learns true cardinalities from execution logs, refining predictions online. The framework computes relative subplan orderings via true and estimated cardinalities and incorporates the L1-error as a key feature, enabling proactive adjustments to the optimizer. Empirical results on augmented JOB-LIGHT-SCALE and STATS-CEB-SCALE workloads show that PLANSIEVE achieves up to 88.7% accuracy in sub-optimality prediction, with improved performance as true cardinalities are observed, demonstrating its potential to reduce the cost of sub-optimal plans in real-world systems.

Abstract

Cardinality estimation remains a fundamental challenge in query optimization, often resulting in sub-optimal execution plans and degraded performance. While errors in cardinality estimation are inevitable, existing methods for identifying sub-optimal plans -- such as metrics like Q-error, P-error, or L1-error -- are limited to post-execution analysis, requiring complete knowledge of true cardinalities and failing to prevent the execution of sub-optimal plans in real-time. This paper introduces PLANSIEVE, a novel framework that identifies sub-optimal plans during query optimization. PLANSIEVE operates by analyzing the relative order of sub-plans generated by the optimizer based on estimated and true cardinalities. It begins with surrogate cardinalities from any third-party estimator and incrementally refines these surrogates as the system processes more queries. Experimental results on the augmented JOB-LIGHT-SCALE and STATS-CEB-SCALE workloads demonstrate that PLANSIEVE achieves an accuracy of up to 88.7\% in predicting sub-optimal plans.

PLANSIEVE: Real-time Suboptimal Query Plan Detection Through Incremental Refinements

TL;DR

PLANSIEVE tackles the challenge of sub-optimal query plans caused by cardinality-estimation errors by predicting plan sub-optimality during query optimization rather than after execution. It combines a transformer-based Classification Model with a Cardinality Collector that uses surrogate cardinalities from a Third-Party Estimator and progressively learns true cardinalities from execution logs, refining predictions online. The framework computes relative subplan orderings via true and estimated cardinalities and incorporates the L1-error as a key feature, enabling proactive adjustments to the optimizer. Empirical results on augmented JOB-LIGHT-SCALE and STATS-CEB-SCALE workloads show that PLANSIEVE achieves up to 88.7% accuracy in sub-optimality prediction, with improved performance as true cardinalities are observed, demonstrating its potential to reduce the cost of sub-optimal plans in real-world systems.

Abstract

Cardinality estimation remains a fundamental challenge in query optimization, often resulting in sub-optimal execution plans and degraded performance. While errors in cardinality estimation are inevitable, existing methods for identifying sub-optimal plans -- such as metrics like Q-error, P-error, or L1-error -- are limited to post-execution analysis, requiring complete knowledge of true cardinalities and failing to prevent the execution of sub-optimal plans in real-time. This paper introduces PLANSIEVE, a novel framework that identifies sub-optimal plans during query optimization. PLANSIEVE operates by analyzing the relative order of sub-plans generated by the optimizer based on estimated and true cardinalities. It begins with surrogate cardinalities from any third-party estimator and incrementally refines these surrogates as the system processes more queries. Experimental results on the augmented JOB-LIGHT-SCALE and STATS-CEB-SCALE workloads demonstrate that PLANSIEVE achieves an accuracy of up to 88.7\% in predicting sub-optimal plans.

Paper Structure

This paper contains 51 sections, 2 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: L1-error calculation for query qry_68_9.
  • Figure 2: PLANSEIVE framework architecture.
  • Figure 3: Execution plan for query t_68_9.
  • Figure 4: Classification model architecture.
  • Figure 5: Subplan featurization as one-hot vector (first five 2-way joins from Figure \ref{['fig:all_subs']}).
  • ...and 6 more figures