Table of Contents
Fetching ...

[Experiment, Analysis, and Benchmark] Systematic Evaluation of Plan-based Adaptive Query Processing

Pei Mu, Anderson Chaves Carniel, Antonio Barbalace, Amir Shaikhha

TL;DR

The paper tackles the persistent problem of unreliable cardinality estimation in DBMSs by systematically evaluating plan-based Adaptive Query Processing (AQP) across on-disk PostgreSQL and in-memory DuckDB. It proposes a unified, modular design to isolate and quantify the contributions of monitors, plan splitters, and sub-plan selectors, implementing these in both PostgreSQL (AQP-PostgreSQL) and DuckDB (AQP-DuckDB). The experimental results reveal that on-disk systems primarily gain from plan reorderings, while in-memory systems benefit mainly from cardinality refinement, with plan-based AQP outperforming relation-based AQP on complex multi-join workloads. These insights provide practical guidance for researchers and database engineers on when and how to apply plan-based AQP, and highlight avenues for improving split-point decision-making and executor co-design to maximize performance gains.

Abstract

Unreliable cardinality estimation remains a critical performance bottleneck in database management systems (DBMSs). Adaptive Query Processing (AQP) strategies address this limitation by providing a more robust query execution mechanism. Specifically, plan-based AQP achieves this by incrementally refining cardinality using feedback from the execution of sub-plans. However, the actual reason behind the improvements of plan-based AQP, especially across different storage architectures (on-disk vs. in-memory DBMSs), remains unexplored. This paper presents the first comprehensive analysis of state-of-the-art plan-based AQP. We implement and evaluate this strategy on both on-disk and in-memory DBMSs across two benchmarks. Our key findings reveal that while plan-based AQP provides overall speedups in both environments, the sources of improvement differ significantly. In the on-disk DBMS, PostgreSQL, performance gains primarily come from the query plan reorderings, but not the cardinality updating mechanism; in fact, updating cardinalities introduces measurable overhead. Conversely, in the in-memory DBMS, DuckDB, cardinality refinement drives significant performance improvements for most queries. We also observe significant performance benefits of the plan-based AQP compared to a state-of-the-art related-based AQP method. These observations provide crucial insights for researchers on when and why plan-based AQP is effective, and ultimately guide database system developers on the tradeoffs between the implementation effort and performance improvements.

[Experiment, Analysis, and Benchmark] Systematic Evaluation of Plan-based Adaptive Query Processing

TL;DR

The paper tackles the persistent problem of unreliable cardinality estimation in DBMSs by systematically evaluating plan-based Adaptive Query Processing (AQP) across on-disk PostgreSQL and in-memory DuckDB. It proposes a unified, modular design to isolate and quantify the contributions of monitors, plan splitters, and sub-plan selectors, implementing these in both PostgreSQL (AQP-PostgreSQL) and DuckDB (AQP-DuckDB). The experimental results reveal that on-disk systems primarily gain from plan reorderings, while in-memory systems benefit mainly from cardinality refinement, with plan-based AQP outperforming relation-based AQP on complex multi-join workloads. These insights provide practical guidance for researchers and database engineers on when and how to apply plan-based AQP, and highlight avenues for improving split-point decision-making and executor co-design to maximize performance gains.

Abstract

Unreliable cardinality estimation remains a critical performance bottleneck in database management systems (DBMSs). Adaptive Query Processing (AQP) strategies address this limitation by providing a more robust query execution mechanism. Specifically, plan-based AQP achieves this by incrementally refining cardinality using feedback from the execution of sub-plans. However, the actual reason behind the improvements of plan-based AQP, especially across different storage architectures (on-disk vs. in-memory DBMSs), remains unexplored. This paper presents the first comprehensive analysis of state-of-the-art plan-based AQP. We implement and evaluate this strategy on both on-disk and in-memory DBMSs across two benchmarks. Our key findings reveal that while plan-based AQP provides overall speedups in both environments, the sources of improvement differ significantly. In the on-disk DBMS, PostgreSQL, performance gains primarily come from the query plan reorderings, but not the cardinality updating mechanism; in fact, updating cardinalities introduces measurable overhead. Conversely, in the in-memory DBMS, DuckDB, cardinality refinement drives significant performance improvements for most queries. We also observe significant performance benefits of the plan-based AQP compared to a state-of-the-art related-based AQP method. These observations provide crucial insights for researchers on when and why plan-based AQP is effective, and ultimately guide database system developers on the tradeoffs between the implementation effort and performance improvements.

Paper Structure

This paper contains 30 sections, 11 figures, 2 tables.

Figures (11)

  • Figure 1: An example of the QuerySplit, where the left part is the logical plan of 20c.sql from the JOB benchmark, the right part is the three sub-plans generated. The numbers beside the relations indicate which sub-plan it belongs to and also represent the order in which the sub-plan is executed.
  • Figure 2: DAG generated by the plan-based AQP method of 20c.sql from the JOB benchmark. It has three sub-plans, where the split point is the relation t.
  • Figure 3: A unified module-level representation for AQP. The white box represents the unmodified modules from the traditional DBMS, while the colored boxes denote the AQP-related modules that are newly introduced or require modification.
  • Figure 4: The modular design for plan-based AQP.
  • Figure 5: In the example of query 20c.sql from JOB, we merge the sub-plans in order from bottom to top.
  • ...and 6 more figures