HERO: Hint-Based Efficient and Reliable Query Optimizer
Sergey Zinchenko, Sergey Iazov
TL;DR
HERO introduces a reliable, efficient hint-based query optimizer that replaces opaque neural predictors with an ensemble of context-aware models plus a graph-based hint store. By semantically guiding search, controlling the degree of parallelism (dop), and employing a parameterized local search, HERO achieves fast inference and strong reliability, demonstrated by latency improvements on standard benchmarks and fewer latency regressions than prior NN-based methods. The work also analyzes the limitations of existing neural approaches and provides open-source datasets to accelerate evaluation of hint-based optimization strategies. Practically, HERO offers interpretable, debuggable behavior suitable for production deployment and lays a foundation for integrating fine-grained hints in future work.
Abstract
We propose a novel model for learned query optimization which provides query hints leading to better execution plans. The model addresses the three key challenges in learned hint-based query optimization: reliable hint recommendation (ensuring non-degradation of query latency), efficient hint exploration, and fast inference. We provide an in-depth analysis of existing NN-based approaches to hint-based optimization and experimentally confirm that they suffer from these challenges. Our alternative solution consists of a new inference schema, based on an ensemble of context-aware models and a graph storage, for reliable hint suggestion and fast inference, and a budget-controlled training procedure with a local search algorithm that addresses the exponential size of the hint search space. In experiments on standard benchmarks, our model demonstrates optimization capability close to the best achievable with coarse-grained hints. Controlling the degree of parallelism (query dop) in addition to operator-related hints enables our model to achieve a 3x latency improvement on the JOB benchmark, which sets a new standard for optimization. Our model is interpretable and easy to debug, which is particularly important for deployment in production.
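To make the idea of budget-controlled exploration concrete, the following is a minimal sketch of a generic greedy local search over coarse-grained hints combined with a dop setting. It is not the procedure used in HERO. The operator names, the dop values, the `measure` callback, and the neighborhood definition (flip one operator toggle or move dop one step) are all illustrative assumptions.

```python
from typing import Callable, Dict, FrozenSet, Iterator, Tuple

# A state is (set of disabled operators, degree of parallelism).
State = Tuple[FrozenSet[str], int]

# Hypothetical operator toggles and dop values, chosen only for illustration.
OPERATORS = ("hashjoin", "mergejoin", "nestloop", "indexscan", "seqscan")
DOPS = (1, 2, 4, 8)


def neighbors(state: State) -> Iterator[State]:
    """Yield states that differ by one operator toggle or one dop step."""
    disabled, dop = state
    for op in OPERATORS:
        yield (disabled ^ {op}, dop)  # flip a single toggle
    i = DOPS.index(dop)
    for j in (i - 1, i + 1):
        if 0 <= j < len(DOPS):
            yield (disabled, DOPS[j])


def local_search(
    measure: Callable[[State], float], start: State, budget: int
) -> Tuple[State, float, int]:
    """Greedy descent that calls `measure` at most `budget` times."""
    seen: Dict[State, float] = {start: measure(start)}
    current = start
    while True:
        best_nb = None
        for cand in neighbors(current):
            if cand not in seen:
                if len(seen) >= budget:
                    continue  # budget exhausted: only reuse cached results
                seen[cand] = measure(cand)
            if seen[cand] < seen[current] and (
                best_nb is None or seen[cand] < seen[best_nb]
            ):
                best_nb = cand
        if best_nb is None:
            return current, seen[current], len(seen)
        current = best_nb  # strictly better, so the loop terminates


def toy_latency(state: State) -> float:
    """Synthetic latency model (an assumption, not measured data)."""
    disabled, dop = state
    penalty = 0.0 if "nestloop" in disabled else 50.0
    return 100.0 / dop + penalty + 20.0 * len(disabled - {"nestloop"})
```

With the synthetic `toy_latency` model, `local_search(toy_latency, (frozenset(), 1), budget=100)` descends from the default configuration to the state that disables `nestloop` and sets dop to 8, while evaluating far fewer states than the full 2^5 x 4 space. In practice `measure` would execute the query under the given hints, so each call is expensive and the budget bounds the total exploration cost.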
