Hype or Heuristic? Quantum Reinforcement Learning for Join Order Optimisation

Maja Franz; Tobias Winker; Sven Groppe; Wolfgang Mauerer

TL;DR

The study investigates quantum reinforcement learning (QRL) for join order (JO) optimisation in database systems, comparing it against a classical RL baseline and a single-step QML method. It proposes a multi-step QRL framework using a hybrid variational quantum circuit with reduced input encoding to handle bushy join trees while using far fewer qubits and trainable parameters. Across JOB benchmark simulations, QRL matches or comes close to classical performance in result quality and can outperform single-step QML by up to 17% in median cost under ideal conditions, though current hardware noise limits practical advantage. The work delivers open-source tooling and provides a nuanced assessment of quantum advantages, highlighting parameter efficiency and scalability as promising benefits for dynamic, low-latency JO scenarios and outlining directions for future hardware and encoding improvements.

Abstract

Identifying optimal join orders (JOs) stands out as a key challenge in database research and engineering. Owing to the large search space, established classical methods rely on approximations and heuristics. Recent efforts have successfully explored reinforcement learning (RL) for JO. Likewise, quantum versions of RL have received considerable scientific attention. Yet it remains an open question whether they can achieve sustainable, overall practical advantages with improved quantum processors. In this paper, we present a novel approach that uses quantum reinforcement learning (QRL) for JO based on a hybrid variational quantum ansatz. Unlike approaches based on quantum(-inspired) optimisation, it handles general bushy join trees rather than resorting to simpler left-deep variants, yet requires multiple orders of magnitude fewer qubits, which are a scarce resource even for post-NISQ systems. Despite moderate circuit depth, the ansatz exceeds current NISQ capabilities, which necessitates evaluation by numerical simulation. While QRL may not significantly outperform classical approaches in solving the JO problem with respect to result quality (although we observe parity), we find a drastic reduction in required trainable parameters. This benefits practically relevant aspects, ranging from shorter training times compared to classical RL and less involved classical optimisation passes to better use of available training data, and fits data-stream and low-latency processing scenarios. Our comprehensive evaluation and careful discussion deliver a balanced perspective on possible practical quantum advantage, provide insights for future systemic approaches, and allow for quantitatively assessing trade-offs of quantum approaches for one of the most crucial problems of database management systems.

Paper Structure (41 sections, 4 equations, 10 figures)

Introduction
Related Work
Preliminaries
Background on the Join Order Problem
Query
Join Tree
Cost Functions
Background on Reinforcement Learning
Background on Quantum Machine Learning
Data Encoding
Data Decoding
Methodology
Single-Step versus Multi-Step Join Ordering
Single-Step QML
Multi-Step QRL
...and 26 more sections

Figures (10)

Figure 1: Single-step versus multi-step approach presented in an RL fashion. Here, $A$ to $F$ are the relations to join. We neglect selection predicates.
Figure 2: Interplay between data encoding (top) and variational quantum circuit (bottom) processing in our approach. Starting from the query and the baseline encoding of Ref. marcus18, we prune unnecessary features and flatten the core input data into a vector that is statically fed into the encoding quantum gates $\hat{U}_{\text{enc}}$. The variational quantum circuit (using a configurable number of qubits) is initialised with qubits in state $\ket{0}$, and iteratively executes blocks of intermingled encoding and variational ($\hat{U}_{\text{var}}$) gates; following a measurement, a classical optimisation procedure delivers new parameter estimates for the variational gates, and the updated circuit is iteratively re-executed. Following established conventions, solid lines indicate quantum information, double lines concern classical information (measurement results that may change in each run of the quantum circuit), and dashed lines represent parameters that are statically fed into the quantum circuit (remaining constant across circuit runs). Grey, thick lines symbolise logical flow.
Figure 3: Processing sequence to iteratively determine join orders. Once the query has been parsed and encoded, subsequent invocations of the variational quantum circuit, as illustrated in Figure 2, determine more and more joins until a complete order has been found.
Figure 4: Details of quantum state manipulation: Parametrised rotations around the $x$ axis ($\hat{R}_{x}(\theta)$) encode information. The variational part comprises parametrised rotations around the $y$ and $z$ axes, implemented by $\hat{R}_{y}(\gamma)$ and $\hat{R}_{z} (\delta)$, followed by a cyclic sequence of $\text{C--}\!\hat{Z}$ gates that create entanglement.
Figure 5: Relative cost median during training. For the methods involving a quantum part, the models with DRU and 20 variational layers are depicted.
...and 5 more figures
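
The gate sequence described in the captions of Figures 2 and 4 can be illustrated with a small state-vector simulation. The following is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes one input feature and one pair of variational angles per qubit, reads out the Pauli-Z expectation of each qubit, and uses hypothetical helper names (`apply_1q`, `apply_cz`, `run_circuit`, `z_expectations`) together with plain NumPy instead of a quantum SDK.

```python
import numpy as np

def rx(theta):
    """Rotation around the x axis (used here for encoding)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(gamma):
    """Rotation around the y axis (variational)."""
    c, s = np.cos(gamma / 2), np.sin(gamma / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(delta):
    """Rotation around the z axis (variational)."""
    return np.array([[np.exp(-1j * delta / 2), 0],
                     [0, np.exp(1j * delta / 2)]])

def apply_1q(state, gate, qubit):
    """Apply a 2x2 gate to one qubit of a state tensor of shape (2,)*n."""
    state = np.tensordot(gate, state, axes=([1], [qubit]))
    return np.moveaxis(state, 0, qubit)

def apply_cz(state, a, b):
    """Controlled-Z: flip the sign of amplitudes where qubits a and b are 1."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[a], idx[b] = 1, 1
    state[tuple(idx)] *= -1
    return state

def run_circuit(x, gammas, deltas):
    """Repeat [R_x encoding, R_y/R_z variational, cyclic CZ] once per layer.

    x: shape (n,) input angles, re-used in every layer (assumption).
    gammas, deltas: shape (layers, n) trainable angles.
    """
    n = len(x)
    assert n >= 2, "the entangling ring needs at least two qubits"
    state = np.zeros((2,) * n, dtype=complex)
    state[(0,) * n] = 1.0  # all qubits start in |0>
    # For n == 2 the ring would apply the same CZ twice, so use it once.
    pairs = [(q, (q + 1) % n) for q in range(n)] if n > 2 else [(0, 1)]
    for g_layer, d_layer in zip(gammas, deltas):
        for q in range(n):
            state = apply_1q(state, rx(x[q]), q)
            state = apply_1q(state, ry(g_layer[q]), q)
            state = apply_1q(state, rz(d_layer[q]), q)
        for a, b in pairs:
            state = apply_cz(state, a, b)
    return state

def z_expectations(state):
    """Pauli-Z expectation value of every qubit."""
    probs = np.abs(state) ** 2
    n = state.ndim
    out = []
    for q in range(n):
        marginal = probs.sum(axis=tuple(a for a in range(n) if a != q))
        out.append(marginal[0] - marginal[1])
    return np.array(out)
```

In a training loop, a classical optimiser would update `gammas` and `deltas` between circuit runs while `x` stays fixed for a given query state; how the measured values are mapped to join decisions is left out of this sketch.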
