Table of Contents
Fetching ...

Learned Offline Query Planning via Bayesian Optimization

Jeffrey Tao, Natalie Maus, Haydn Jones, Yimeng Zeng, Jacob R. Gardner, Ryan Marcus

TL;DR

This paper tackles offline query optimization for repetitive analytic workloads by proposing BayesQO, which uses a variational autoencoder to embed query plans into a latent space and applies Bayesian optimization to discover fast plans. The method handles timeouts as censored observations and leverages initialization strategies from traditional optimizers and cross-query fine-tuned language models to improve search efficiency. Empirically, BayesQO outperforms online learned optimizers and hint-based baselines across multiple benchmarks, sometimes yielding substantial plan latency reductions, and demonstrates robustness to data drift with strategies for retraining and reoptimization. The work highlights practical implications for deploying offline optimization in real DBMS deployments and points to future directions in integrating with broader system design choices and leveraging learned initializations for rapid adaptation.

Abstract

Analytics database workloads often contain queries that are executed repeatedly. Existing optimization techniques generally prioritize keeping optimization cost low, normally well below the time it takes to execute a single instance of a query. If a given query is going to be executed thousands of times, could it be worth investing significantly more optimization time? In contrast to traditional online query optimizers, we propose an offline query optimizer that searches a wide variety of plans and incorporates query execution as a primitive. Our offline query optimizer combines variational auto-encoders with Bayesian optimization to find optimized plans for a given query. We compare our technique to the optimal plans possible with PostgreSQL and recent RL-based systems over several datasets, and show that our technique finds faster query plans.

Learned Offline Query Planning via Bayesian Optimization

TL;DR

This paper tackles offline query optimization for repetitive analytic workloads by proposing BayesQO, which uses a variational autoencoder to embed query plans into a latent space and applies Bayesian optimization to discover fast plans. The method handles timeouts as censored observations and leverages initialization strategies from traditional optimizers and cross-query fine-tuned language models to improve search efficiency. Empirically, BayesQO outperforms online learned optimizers and hint-based baselines across multiple benchmarks, sometimes yielding substantial plan latency reductions, and demonstrates robustness to data drift with strategies for retraining and reoptimization. The work highlights practical implications for deploying offline optimization in real DBMS deployments and points to future directions in integrating with broader system design choices and leveraging learned initializations for rapid adaptation.

Abstract

Analytics database workloads often contain queries that are executed repeatedly. Existing optimization techniques generally prioritize keeping optimization cost low, normally well below the time it takes to execute a single instance of a query. If a given query is going to be executed thousands of times, could it be worth investing significantly more optimization time? In contrast to traditional online query optimizers, we propose an offline query optimizer that searches a wide variety of plans and incorporates query execution as a primitive. Our offline query optimizer combines variational auto-encoders with Bayesian optimization to find optimized plans for a given query. We compare our technique to the optimal plans possible with PostgreSQL and recent RL-based systems over several datasets, and show that our technique finds faster query plans.

Paper Structure

This paper contains 21 sections, 8 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: A comparison of reinforcement learning (RL, left) and Bayesian optimization (BO, right). BO is a better match for the offline query optimization problem because we do not care about "regressions" in the offline search phase; we only care about the quality of the best-discovered plan.
  • Figure 2: BayesQO workflow
  • Figure 3: Best plans found at the end of optimization with each technique for each workload. Towards the top right corner is better. ❶ On each workload, Random fails to find any improvement over the best Bao plan for 30-50% of queries. ❷ BayesQO always finds the most improvement compared to the other methods, with this difference being more pronounced on JOB and Stack than on CEB. On Stack, BayesQO finds over 2× improvement on 50% of queries. ❸ Balsa underperforms when used as an offline optimizer on Stack due to longer query runtimes limiting exploration by the RL algorithm. ❹ BayesQO finds improvement on the 25% of Stack queries where the other two techniques find none.
  • Figure 4: Case studies showing optimization time vs. plan performance for individual queries. Lower is better.
  • Figure 5: Ablation study of our novel BO scheme.
  • ...and 5 more figures