The Stretto Execution Engine for LLM-Augmented Data Systems
Gabriele Sanmartino, Matthias Urban, Paolo Papotti, Carsten Binnig
TL;DR
Stretto addresses the fundamental runtime–accuracy trade-off in LLM-augmented data systems by providing end-to-end guarantees through a holistic, gradient-based optimizer that jointly selects operator implementations and budgets across a query plan. It expands the physical design space with KV-cache–enabled semantic operators, creating a dense spectrum of cost–quality trade-offs that the optimizer can exploit to meet global precision and recall targets. The architecture combines a global optimizer with offline KV cache creation and online batched execution, yielding substantial speedups over state-of-the-art baselines while maintaining probabilistic guarantees via Bayesian credible intervals. Across multimodal datasets and diverse queries, Stretto demonstrates robust target satisfaction, effective optimization of operator cascades, and significant runtime reductions, illustrating the practical viability of end-to-end quality guarantees in LLM-augmented data systems.
Abstract
LLM-augmented data systems enable semantic querying over structured and unstructured data, but executing queries with LLM-powered operators introduces a fundamental runtime–accuracy trade-off. In this paper, we present Stretto, a new execution engine that provides end-to-end query guarantees while navigating this trade-off efficiently and holistically. To this end, Stretto formulates query planning as a constrained optimization problem and uses a gradient-based optimizer to jointly select operator implementations and allocate error budgets across pipelines. Moreover, to enable fine-grained execution choices, Stretto uses KV-caching in a novel way to realize a spectrum of physical operators, turning a sparse design space into a dense continuum of runtime–accuracy trade-offs. Experiments show that Stretto outperforms state-of-the-art systems while consistently meeting quality guarantees.
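The TL;DR mentions probabilistic guarantees via Bayesian credible intervals. As a rough illustration of that general idea, and not of Stretto's actual procedure, the sketch below estimates a one-sided lower credible bound on an operator's precision from a labeled sample. The uniform Beta(1, 1) prior, the Monte Carlo quantile estimate, and the function names `precision_lower_bound` and `meets_target` are all assumptions made for this example.

```python
import random


def precision_lower_bound(correct, total, confidence=0.95, draws=20000, seed=0):
    """Estimate a one-sided lower credible bound on precision.

    Assumes a uniform Beta(1, 1) prior, so the posterior after observing
    `correct` true positives among `total` returned items is
    Beta(1 + correct, 1 + total - correct). The bound is the
    (1 - confidence) quantile of that posterior, estimated by sampling.
    """
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    a = 1 + correct
    b = 1 + total - correct
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    index = min(int((1.0 - confidence) * draws), draws - 1)
    return samples[index]


def meets_target(correct, total, target, confidence=0.95):
    """Return True if the lower credible bound reaches the precision target."""
    return precision_lower_bound(correct, total, confidence) >= target


if __name__ == "__main__":
    bound = precision_lower_bound(95, 100)
    print(f"lower credible bound on precision: {bound:.3f}")
    print("target 0.85 met:", meets_target(95, 100, 0.85))
    print("target 0.95 met:", meets_target(95, 100, 0.95))
```

Under these assumptions, a target counts as met only when the lower bound clears it, so a sample with 95 of 100 correct does not by itself justify a 0.95 precision target.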
