
Enhanced Differential Testing in Emerging Database Systems

Yuancheng Jiang, Jianing Wang, Chuqi Zhang, Roland Yap, Zhenkai Liang, Manuel Rigger

TL;DR

This work targets the reliability of emerging SQL-like database systems by addressing logic bugs that are hard to detect with traditional fuzzing. It introduces SQLxDiff, an enhanced differential testing framework that leverages a mature relational database as the reference, identifies shared clauses, maps features from emerging systems back to relational equivalents, and generates semantically equivalent but syntactically different queries to uncover discrepancies. Across QuestDB, TDengine, RisingWave, and CrateDB, SQLxDiff found 57 unknown bugs (17 logic, 40 internal), with 50 fixed and 5 confirmed, demonstrating substantial gains in bug detection, query-plan diversity, and execution success. The approach emphasizes manual yet scalable clause mappings, supported by an extensible pipeline and potential automation via language models, offering a practical pathway to improve the robustness of evolving database platforms.

Abstract

In recent years, a plethora of database management systems have surfaced to meet the demands of various scenarios. Emerging database systems, such as time-series and streaming database systems, are tailored to specific use cases requiring enhanced functionality and performance. However, as they are typically less mature, they can contain bugs that cause either incorrect results or errors that impact reliability. To tackle this, we propose enhanced differential testing to uncover various bugs in emerging SQL-like database systems. The challenge is how to deal with the differences among these emerging database systems. Our insight is that many emerging database systems are conceptually extensions of relational database systems, making it possible to reveal logic bugs by leveraging existing, known-reliable relational database systems. However, due to inevitable gaps in syntax or semantics, it remains challenging to scale differential testing to various emerging database systems. We enhance differential testing for emerging database systems with three steps: (i) identifying shared clauses; (ii) extending shared clauses by mapping new features back to existing clauses of relational database systems; and (iii) generating differential inputs using the extended shared clauses. We implemented our approach in a tool called SQLxDiff and applied it to four popular emerging database systems. In total, we found 57 unknown bugs, of which 17 were logic bugs and 40 were internal errors. Overall, vendors fixed 50 bugs and confirmed 5. Our results demonstrate the practicality and effectiveness of SQLxDiff in detecting bugs in emerging database systems, which has the potential to improve the reliability of their applications.
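
The three steps in the abstract can be illustrated with a small sketch. It is not SQLxDiff's implementation: SQLite stands in for both the reference system and the "emerging" system, and `bucket_of(ts, w)` is a hypothetical time-bucketing feature invented for this example (registered as a user-defined function, optionally with a deliberate bug). The helper names `make_db`, `make_emerging`, `map_to_reference`, and `differential_check` are likewise assumptions. The sketch maps the new feature back to a relational expression, runs both queries, and compares the results as multisets.

```python
import re
import sqlite3
from collections import Counter

# Query in the dialect of the simulated emerging system.
# bucket_of(ts, 10) is a hypothetical feature, not syntax of any real system.
QUERY = "SELECT bucket_of(ts, 10) AS b, SUM(v) FROM t GROUP BY b"


def make_db(rows):
    """Create an in-memory SQLite database that plays the reference role."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (ts INTEGER, v INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return conn


def make_emerging(rows, buggy):
    """Simulate an emerging system: same data plus the extra bucket_of feature."""
    conn = make_db(rows)
    if buggy:
        # Deliberately wrong: rounds to the nearest bucket instead of flooring.
        fn = lambda ts, w: round(ts / w) * w
    else:
        fn = lambda ts, w: (ts // w) * w
    conn.create_function("bucket_of", 2, fn)
    return conn


def map_to_reference(query):
    """Map the new feature back to a relational expression (step ii).

    Assumes non-negative integer timestamps, where SQLite's truncating
    integer division agrees with floor division.
    """
    return re.sub(r"bucket_of\((\w+),\s*(\d+)\)", r"((\1 / \2) * \2)", query)


def differential_check(emerging, reference, query):
    """Run the query on both systems and compare results as multisets."""
    got = Counter(emerging.execute(query).fetchall())
    expected = Counter(reference.execute(map_to_reference(query)).fetchall())
    return got == expected, got, expected


if __name__ == "__main__":
    rows = [(1, 5), (12, 7), (19, 1), (25, 2)]
    for buggy in (False, True):
        same, got, expected = differential_check(
            make_emerging(rows, buggy), make_db(rows), QUERY
        )
        print("buggy" if buggy else "correct", "-> results match:", same)
```

Comparing multisets rather than ordered lists avoids false alarms when the two systems return rows in different orders for a query without `ORDER BY`. In a real setting, the mapping would be written per clause for each target system rather than as a single regular expression.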

Paper Structure (36 sections, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Popularity Trend of Database Systems Past Decade
  • Figure 2: Approach Overview
  • Figure 3: How Our Approach Improves Differential Testing
  • Figure 4: SQLxDiff's Improvement on Unique Query Plans
  • Figure 5: Comparison of Unique Query Plans
  • ...and 1 more figure