Reqo: A Comprehensive Learning-Based Cost Model for Robust and Explainable Query Optimization

Baoming Chang; Amin Kamali; Verena Kantere

TL;DR

Reqo tackles robustness and explainability in learning-based query optimization by jointly addressing plan generation, representation, and plan selection. It introduces a novel Bi-GNN+GRU plan representation, a subplan-based explainability technique that derives plan-generation hints, and an uncertainty-aware learning-to-rank estimator that learns to balance cost and uncertainty. The methods form a feedback loop where representations improve explanations and robustness, explanations guide generation hints, and comparisons refine estimates. Experiments on diverse benchmarks show Reqo consistently surpasses state-of-the-art baselines in cost estimation accuracy, plan quality, robustness, and explainability.

Abstract

Although machine learning (ML) shows potential in improving query optimization by generating and selecting more efficient plans, ensuring the robustness of learning-based cost models (LCMs) remains challenging. These LCMs currently lack explainability, which undermines user trust and limits the ability to derive insights from their cost predictions to improve plan quality. Accurately converting tree-structured query plans into representations via tree models is also essential, as omitting any details may negatively impact subsequent cost model performance. Additionally, inherent uncertainty in cost estimation leads to inaccurate predictions, resulting in suboptimal plan selection. To address these challenges, we introduce Reqo, a Robust and Explainable Query Optimization cost model that comprehensively enhances three main stages in query optimization: plan generation, plan representation, and plan selection. Reqo integrates three innovations: the first explainability technique for LCMs that quantifies subgraph contributions and produces plan generation hints to enhance candidate plan quality; a novel tree model based on Bidirectional Graph Neural Networks (Bi-GNNs) with a Gated Recurrent Unit (GRU) aggregator to further capture both node-level and structural information and effectively strengthen plan representation; and an uncertainty-aware learning-to-rank cost estimator that adaptively integrates cost estimates with uncertainties to enhance plan selection robustness. Extensive experiments demonstrate that Reqo outperforms state-of-the-art approaches across all three stages.
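To make the idea of uncertainty-aware plan selection concrete, here is a minimal sketch. It is not Reqo's estimator: the abstract describes a learning-to-rank model that learns how to balance cost against uncertainty, whereas this sketch substitutes a fixed, hand-set weight. The names `PlanEstimate`, `risk_adjusted_score`, `select_plan`, and `risk_weight`, along with the plan labels and numbers, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlanEstimate:
    """Hypothetical container for one candidate plan's predictions."""
    plan_id: str
    cost: float         # predicted execution cost (arbitrary units)
    uncertainty: float  # predicted uncertainty, e.g. a standard deviation


def risk_adjusted_score(est: PlanEstimate, risk_weight: float) -> float:
    # Assumption: a simple linear penalty stands in for the learned
    # cost/uncertainty trade-off described in the abstract.
    return est.cost + risk_weight * est.uncertainty


def select_plan(candidates: list[PlanEstimate], risk_weight: float = 1.0) -> PlanEstimate:
    """Return the candidate with the lowest risk-adjusted score."""
    if not candidates:
        raise ValueError("no candidate plans")
    return min(candidates, key=lambda e: risk_adjusted_score(e, risk_weight))


# Illustrative (made-up) candidates: one cheap but uncertain, one slightly
# more expensive but predicted with much higher confidence.
candidates = [
    PlanEstimate("hash_join_first", cost=100.0, uncertainty=40.0),
    PlanEstimate("nested_loop_first", cost=110.0, uncertainty=5.0),
]

cost_only = select_plan(candidates, risk_weight=0.0)   # ignores uncertainty
risk_aware = select_plan(candidates, risk_weight=1.0)  # penalizes uncertainty
print(cost_only.plan_id, risk_aware.plan_id)
```

With a weight of zero the selection reduces to picking the lowest predicted cost. With a positive weight, a plan whose estimate is less certain can lose to a slightly costlier but more reliable one, which is the kind of robustness trade-off the abstract refers to.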
