Table of Contents
Fetching ...

Budget-aware Query Tuning: An AutoML Perspective

Wentao Wu, Chi Wang

TL;DR

This work reframes the database cost model by treating cost units as tunable parameters and demonstrates that budget-aware, AutoML-inspired tuning of these units can produce much faster query plans than default optimizer choices. It formulates Budget-aware Query Tuning (QT) and Budget-aware Workload Tuning (WT), drawing direct parallels to hyperparameter optimization and proposing practical strategies to operate under tuning time budgets. The authors introduce QT and WT solutions, including plan caching and early stopping for QT, and four WT strategies (Round Robin, Cost-based Prioritization, Multi-armed Bandit, Improvement Rate), validating them on benchmark (JOB, TPC-DS) and real workloads. Experimental results show meaningful improvements in execution time and substantial reductions in tuning time for WT, supporting the practical viability of budget-constrained, data-driven plan optimization. The work lays a foundation for further leveraging workload similarity and clustering to enhance Bayesian optimization-based tuning in future research.

Abstract

Modern database systems rely on cost-based query optimizers to come up with good execution plans for input queries. Such query optimizers rely on cost models to estimate the costs of candidate query execution plans. A cost model represents a function from a set of cost units to query execution cost, where each cost unit specifies the unit cost of executing a certain type of query processing operation (such as table scan or join). These cost units are traditionally viewed as constants, whose values only depend on the platform configuration where the database system runs on top of but are invariant for queries processed by the database system. In this paper, we challenge this classic view by thinking of these cost units as variables instead. We show that, by varying the cost-unit values one can obtain query plans that significantly outperform the default query plans returned by the query optimizer when viewing the cost units as constants. We term this cost-unit tuning process "query tuning" (QT) and show that it is similar to the well-known hyper-parameter optimization (HPO) problem in AutoML. As a result, any state-of-the-art HPO technologies can be applied to QT. We study the QT problem in the context of anytime tuning, which is desirable in practice by constraining the total time spent on QT within a given budget -- we call this problem budget-aware query tuning. We further extend our study from tuning a single query to tuning a workload with multiple queries, and we call this generalized problem budget-aware workload tuning (WT), which aims for minimizing the execution time of the entire workload. WT is more challenging as one needs to further prioritize individual query tuning within the given time budget. We propose solutions to both QT and WT and experimental evaluation using both benchmark and real workloads demonstrates the efficacy of our proposed solutions.

Budget-aware Query Tuning: An AutoML Perspective

TL;DR

This work reframes the database cost model by treating cost units as tunable parameters and demonstrates that budget-aware, AutoML-inspired tuning of these units can produce much faster query plans than default optimizer choices. It formulates Budget-aware Query Tuning (QT) and Budget-aware Workload Tuning (WT), drawing direct parallels to hyperparameter optimization and proposing practical strategies to operate under tuning time budgets. The authors introduce QT and WT solutions, including plan caching and early stopping for QT, and four WT strategies (Round Robin, Cost-based Prioritization, Multi-armed Bandit, Improvement Rate), validating them on benchmark (JOB, TPC-DS) and real workloads. Experimental results show meaningful improvements in execution time and substantial reductions in tuning time for WT, supporting the practical viability of budget-constrained, data-driven plan optimization. The work lays a foundation for further leveraging workload similarity and clustering to enhance Bayesian optimization-based tuning in future research.

Abstract

Modern database systems rely on cost-based query optimizers to come up with good execution plans for input queries. Such query optimizers rely on cost models to estimate the costs of candidate query execution plans. A cost model represents a function from a set of cost units to query execution cost, where each cost unit specifies the unit cost of executing a certain type of query processing operation (such as table scan or join). These cost units are traditionally viewed as constants, whose values only depend on the platform configuration where the database system runs on top of but are invariant for queries processed by the database system. In this paper, we challenge this classic view by thinking of these cost units as variables instead. We show that, by varying the cost-unit values one can obtain query plans that significantly outperform the default query plans returned by the query optimizer when viewing the cost units as constants. We term this cost-unit tuning process "query tuning" (QT) and show that it is similar to the well-known hyper-parameter optimization (HPO) problem in AutoML. As a result, any state-of-the-art HPO technologies can be applied to QT. We study the QT problem in the context of anytime tuning, which is desirable in practice by constraining the total time spent on QT within a given budget -- we call this problem budget-aware query tuning. We further extend our study from tuning a single query to tuning a workload with multiple queries, and we call this generalized problem budget-aware workload tuning (WT), which aims for minimizing the execution time of the entire workload. WT is more challenging as one needs to further prioritize individual query tuning within the given time budget. We propose solutions to both QT and WT and experimental evaluation using both benchmark and real workloads demonstrates the efficacy of our proposed solutions.
Paper Structure (16 sections, 9 equations, 4 figures, 3 tables)

This paper contains 16 sections, 9 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Percentage improvement of each query in the JOB workload.
  • Figure 2: Execution time of default plan vs. best plan found by QT for each query of the JOB workload.
  • Figure 3: Percentage improvement resulted from workload tuning when varying the budget on tuning time.
  • Figure 4: Comparison of WT algorithms on TPC-DS

Theorems & Definitions (2)

  • Definition 1: Budget-aware Query Tuning
  • Definition 2: Budget-aware Workload Tuning