Table of Contents
Fetching ...

CERT: Finding Performance Issues in Database Systems Through the Lens of Cardinality Estimation

Jinsheng Ba, Manuel Rigger

TL;DR

CERT introduces a black-box testing paradigm for DBMS performance issues by exploiting cardinality estimation. By generating a query and a more restrictive version, then validating that the restricted query’s estimated cardinality is not greater (cardinality restriction monotonicity), CERT identifies potential performance bugs without executing queries. Evaluated on MySQL, TiDB, and CockroachDB, CERT found 13 unique cardinality-related issues (9 confirmed or fixed; 2 backlogged) and achieved about 386× higher throughput than prior execution-based approaches, demonstrating scalable bug-finding and practical impact for DBMS development. The approach is general, extendable with additional restriction rules, and complementary to existing benchmarking and testing techniques, offering a fast, robust avenue to improve query optimization performance.

Abstract

Database Management Systems (DBMSs) process a given query by creating a query plan, which is subsequently executed, to compute the query's result. Deriving an efficient query plan is challenging, and both academia and industry have invested decades into researching query optimization. Despite this, DBMSs are prone to performance issues, where a DBMS produces an unexpectedly inefficient query plan that might lead to the slow execution of a query. Finding such issues is a longstanding problem and inherently difficult, because no ground truth information on an expected execution time exists. In this work, we propose Cardinality Estimation Restriction Testing (CERT), a novel technique that finds performance issues through the lens of cardinality estimation. Given a query on a database, CERT derives a more restrictive query (e.g., by replacing a LEFT JOIN with an INNER JOIN), whose estimated number of rows should not exceed the estimated number of rows for the original query. CERT tests cardinality estimation specifically, because they were shown to be the most important part for query optimization; thus, we expect that finding and fixing such issues might result in the highest performance gains. In addition, we found that other kinds of query optimization issues can be exposed by unexpected estimated cardinalities, which can also be found by CERT. CERT is a black-box technique that does not require access to the source code; DBMSs expose query plans via the EXPLAIN statement. CERT eschews executing queries, which is costly and prone to performance fluctuations. We evaluated CERT on three widely used and mature DBMSs, MySQL, TiDB, and CockroachDB. CERT found 13 unique issues, of which 2 issues were fixed and 9 confirmed by the developers. We expect that this new angle on finding performance bugs will help DBMS developers in improving DMBSs' performance.

CERT: Finding Performance Issues in Database Systems Through the Lens of Cardinality Estimation

TL;DR

CERT introduces a black-box testing paradigm for DBMS performance issues by exploiting cardinality estimation. By generating a query and a more restrictive version, then validating that the restricted query’s estimated cardinality is not greater (cardinality restriction monotonicity), CERT identifies potential performance bugs without executing queries. Evaluated on MySQL, TiDB, and CockroachDB, CERT found 13 unique cardinality-related issues (9 confirmed or fixed; 2 backlogged) and achieved about 386× higher throughput than prior execution-based approaches, demonstrating scalable bug-finding and practical impact for DBMS development. The approach is general, extendable with additional restriction rules, and complementary to existing benchmarking and testing techniques, offering a fast, robust avenue to improve query optimization performance.

Abstract

Database Management Systems (DBMSs) process a given query by creating a query plan, which is subsequently executed, to compute the query's result. Deriving an efficient query plan is challenging, and both academia and industry have invested decades into researching query optimization. Despite this, DBMSs are prone to performance issues, where a DBMS produces an unexpectedly inefficient query plan that might lead to the slow execution of a query. Finding such issues is a longstanding problem and inherently difficult, because no ground truth information on an expected execution time exists. In this work, we propose Cardinality Estimation Restriction Testing (CERT), a novel technique that finds performance issues through the lens of cardinality estimation. Given a query on a database, CERT derives a more restrictive query (e.g., by replacing a LEFT JOIN with an INNER JOIN), whose estimated number of rows should not exceed the estimated number of rows for the original query. CERT tests cardinality estimation specifically, because they were shown to be the most important part for query optimization; thus, we expect that finding and fixing such issues might result in the highest performance gains. In addition, we found that other kinds of query optimization issues can be exposed by unexpected estimated cardinalities, which can also be found by CERT. CERT is a black-box technique that does not require access to the source code; DBMSs expose query plans via the EXPLAIN statement. CERT eschews executing queries, which is costly and prone to performance fluctuations. We evaluated CERT on three widely used and mature DBMSs, MySQL, TiDB, and CockroachDB. CERT found 13 unique issues, of which 2 issues were fixed and 9 confirmed by the developers. We expect that this new angle on finding performance bugs will help DBMS developers in improving DMBSs' performance.
Paper Structure (50 sections, 2 figures, 5 tables)

This paper contains 50 sections, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Overview of CERT.
  • Figure 2: The inequality relationships of estimated cardinalities in the clause with an example to join two tables.