Efficient Query Rewrite Rule Discovery via Standardized Enumeration and Learning-to-Rank

Yuan Zhang, Yuxing Chen, Yuekun Yu, Jinbin Huang, Rui Mao, Anqun Pan, Lixiong Zheng, Jianbin Qin

TL;DR

SLER is a scalable system for efficient and effective rewrite rule discovery. It combines standardized template enumeration with a learning-to-rank approach, paving the way for next-generation optimizers powered by comprehensive, adaptive rule spaces.

Abstract

Query rewriting is essential for database performance optimization, but existing automated rule enumeration methods suffer from exponential search spaces, severe redundancy, and poor scalability, especially when handling complex query plans with five or more nodes, where a node represents an operator in the plan tree. We present SLER, a scalable system that enables efficient and effective rewrite rule discovery by combining standardized template enumeration with a learning-to-rank approach. SLER uses standardized templates, which are abstractions of query plans that preserve operator structure but remove data-specific details, to eliminate structural redundancies and drastically reduce the search space. A learning-to-rank model guides enumeration by pre-filtering the most promising template pairs, enabling scalable rule generation for templates with many nodes. Evaluated on over 11,000 real-world SQL queries from both open-source and commercial workloads, SLER has automatically constructed a rewrite rule repository exceeding 1 million rules, the largest empirically validated rewrite rule library to date. Notably, at the scale of one million rules, SLER supports query plan templates with complexity up to channel level depth. This unprecedented scale opens the door to discovering highly intricate transformations across diverse query patterns. Critically, SLER's template-driven design and learned ranking mechanism are inherently extensible, allowing seamless integration of new and complex operators and paving the way for next-generation optimizers powered by comprehensive, adaptive rule spaces.
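
To make the idea of a standardized template concrete, the following is a minimal sketch of how two query plans that differ only in data-specific details could collapse into one template. It is an illustration under stated assumptions, not SLER's actual implementation: the `PlanNode` class, the `standardize` function, and the `t0`, `t1`, ... placeholder scheme are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PlanNode:
    """Hypothetical plan-tree node: one operator per node."""
    op: str                                  # operator name, e.g. "Scan", "Filter", "Join"
    detail: str = ""                         # data-specific detail (table name, predicate text)
    children: Tuple["PlanNode", ...] = ()


def standardize(node: PlanNode, names: Optional[Dict[str, str]] = None) -> str:
    """Render a plan as a template string.

    Operator structure is preserved; each distinct detail is replaced by a
    placeholder numbered in order of first appearance (pre-order traversal).
    """
    if names is None:
        names = {}
    label = node.op
    if node.detail:
        if node.detail not in names:
            names[node.detail] = f"t{len(names)}"
        label += f"[{names[node.detail]}]"
    if node.children:
        inner = ",".join(standardize(child, names) for child in node.children)
        label += f"({inner})"
    return label


# Two plans with the same shape but different tables and predicates.
q1 = PlanNode("Join", children=(
    PlanNode("Scan", "orders"),
    PlanNode("Filter", "age > 30", (PlanNode("Scan", "customers"),)),
))
q2 = PlanNode("Join", children=(
    PlanNode("Scan", "lineitem"),
    PlanNode("Filter", "size > 5", (PlanNode("Scan", "parts"),)),
))

# Both plans map to the same template, so only one needs to be enumerated.
templates = {standardize(q1), standardize(q2)}
print(templates)
```

Because placeholders are assigned by first appearance, a self-join (the same table scanned twice) yields a different template from a join of two distinct tables, so this sketch removes naming redundancy without merging structurally different plans.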

Paper Structure

This paper contains 21 sections, 7 equations, 5 figures, 4 tables, and 2 algorithms.

Figures (5)

  • Figure 1: Example of a rule model.
  • Figure 2: System architecture of SLER.
  • Figure 3: Architecture of standardized rule enumeration.
  • Figure 4: A case for redundancy rules.
  • Figure 5: Architecture of rule effectiveness ranking.

Theorems & Definitions (5)

  • Example 1
  • Definition 1: Query Rewrite Rule
  • Definition 2: Small-to-Large Composition
  • Definition 3: Proof Length and Optimality
  • Definition 4