Piece of CAKE: Adaptive Execution Engines via Microsecond-Scale Learning
Zijie Zhao, Ryan Marcus
TL;DR
This work addresses kernel selection for low-level database operators. It introduces CAKE, a microsecond-scale contextual multi-armed bandit that uses counterfactual feedback to learn a kernel choice for each morsel. CAKE combines online learning with a global convergence plan that ends in regret-tree compilation, which makes inference nearly overhead-free once the policy stabilizes. On the IMDb, Stack, and DSB datasets, CAKE reduces end-to-end latency by up to 2x compared to static heuristics and performs close to an oracle, with modest overhead and good sample efficiency. The approach offers a practical, plug-in way to make a DBMS adapt to data distributions without manual per-dataset tuning, which matters most for complex analytical workloads.
Abstract
Low-level database operators often admit multiple physical implementations ("kernels") that are semantically equivalent but have vastly different performance characteristics depending on the input data distribution. Existing database systems typically rely on static heuristics or worst-case optimal defaults to select these kernels, often missing significant performance opportunities. In this work, we propose CAKE (Counterfactual Adaptive Kernel Execution), a system that learns to select the optimal kernel for each data "morsel" using a microsecond-scale contextual multi-armed bandit. CAKE circumvents the high latency of traditional reinforcement learning in two ways: it exploits the cheapness of counterfactuals (selectively running multiple kernels to obtain full feedback), and it compiles policies into low-latency regret trees. Experimentally, we show that CAKE can reduce end-to-end workload latency by up to 2x compared to state-of-the-art static heuristics.
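To make the counterfactual idea concrete, the following is a minimal Python sketch of per-morsel kernel selection with occasional full feedback. It is an illustration under stated assumptions, not CAKE's actual algorithm: the two filter kernels, the `FullFeedbackSelector` class, the `explore_prob` parameter, and the string-valued `context` are all hypothetical, the learner is a simple per-context running average rather than the paper's bandit, and regret-tree compilation is not modeled.

```python
import random
import time
from collections import defaultdict


def filter_branching(morsel, threshold):
    # Hypothetical kernel 1: explicit loop with a branch per element.
    out = []
    for x in morsel:
        if x < threshold:
            out.append(x)
    return out


def filter_comprehension(morsel, threshold):
    # Hypothetical kernel 2: same semantics, different implementation.
    return [x for x in morsel if x < threshold]


KERNELS = [filter_branching, filter_comprehension]


class FullFeedbackSelector:
    """Tracks the mean observed cost of each kernel per context (assumed design)."""

    def __init__(self, n_kernels, explore_prob=0.1, seed=0):
        self.n_kernels = n_kernels
        self.explore_prob = explore_prob
        self.rng = random.Random(seed)
        self.total = defaultdict(lambda: [0.0] * n_kernels)
        self.count = defaultdict(lambda: [0] * n_kernels)

    def update(self, context, kernel, cost):
        self.total[context][kernel] += cost
        self.count[context][kernel] += 1

    def best(self, context):
        totals, counts = self.total[context], self.count[context]
        # Untried kernels get a mean of 0.0, so they are selected first.
        means = [t / c if c else 0.0 for t, c in zip(totals, counts)]
        return min(range(self.n_kernels), key=means.__getitem__)

    def should_explore(self):
        return self.rng.random() < self.explore_prob


def run_morsel(selector, morsel, threshold, context):
    if selector.should_explore():
        # Counterfactual step: run every kernel on the same morsel so that
        # each one receives a cost observation (full feedback).
        result = None
        for k, kernel in enumerate(KERNELS):
            start = time.perf_counter()
            result = kernel(morsel, threshold)
            selector.update(context, k, time.perf_counter() - start)
        return result
    # Otherwise run only the kernel with the lowest mean cost so far.
    k = selector.best(context)
    start = time.perf_counter()
    result = KERNELS[k](morsel, threshold)
    selector.update(context, k, time.perf_counter() - start)
    return result
```

Because the kernels are semantically equivalent, running several of them on one morsel changes only the cost, never the result. That is what makes full feedback affordable here, in contrast to a standard bandit that observes only the arm it pulled.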
