OptBench: An Interactive Workbench for AI/ML-SQL Co-Optimization [Extended Demonstration Proposal]

Jaykumar Tandel, Douglas Oscarson, Jia Zou

TL;DR

OptBench is an interactive workbench for building and benchmarking query optimizers for hybrid SQL+AI/ML queries in a transparent, apples-to-apples manner. It lets practitioners and researchers prototype optimizer ideas, inspect plan transformations, and quantitatively compare optimizer designs on multimodal inference queries within a single workbench.

Abstract

Database workloads increasingly nest artificial intelligence (AI) and machine learning (ML) pipelines and model inference within data processing, yielding hybrid SQL+AI/ML queries that mix relational operators with expensive, opaque AI/ML operators, often expressed as user-defined functions (UDFs). These workloads are challenging to optimize for four reasons: (1) ML operators behave like black boxes; (2) data-dependent effects such as sparsity, selectivity, and cardinality can dominate runtime; (3) domain experts often rely on practical heuristics that are difficult to develop within monolithic optimizers; and (4) AI/ML operators introduce numerous co-optimization opportunities, such as factorization, pushdown, ML-to-SQL conversion, and linear-algebra-to-relational-algebra rewrites, which significantly enlarge the search space of equivalent execution plans. At the same time, research prototypes for SQL+ML optimization are difficult to evaluate fairly because they are typically developed on different platforms and evaluated using different queries. We present OptBench, an interactive workbench for building and benchmarking query optimizers for hybrid SQL+AI/ML queries in a transparent, apples-to-apples manner. OptBench runs all optimizers on a unified backend using DuckDB and exposes an interactive web interface that allows users to (i) construct query optimizers by leveraging and extending abstracted logical plan rewrite actions, (ii) benchmark and compare different optimizer implementations over a suite of diverse queries while recording decision traces and latency, and (iii) visualize logical plans produced by different optimizers side by side. The system enables practitioners and researchers to prototype optimizer ideas, inspect plan transformations, and quantitatively compare optimizer designs on multimodal inference queries within a single workbench.

Paper Structure (25 sections, 3 figures, 4 tables, 1 algorithm)

Figures (3)

  • Figure 1: OptBench system overview
  • Figure 2: OptBench workbench overview. Users select a SQL+ML query (top-left), compare two optimizers side-by-side (top-center), and inspect the statistics, rewrite action catalog, optimizer profiles, and rule definitions (middle). The workbench also supports uploading external optimizer/action definitions (bottom-left) and benchmarking latency across the query suite (bottom-right).
  • Figure 3: Effect of the example rule-based optimizer on an inference query. Starting from the baseline plan (no optimizer), the optimizer fires a metric-driven rule that invokes the actions highlighted in the figure (e.g., MLDecompositionPushdownRewriteAction and MatMulDense2SparseRewriteAction).
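
The metric-driven rule firing described for Figure 3 can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the PlanNode and Rule classes, the dense_to_sparse action, the "sparsity" statistic, and the 0.9 threshold are hypothetical and are not taken from OptBench, whose actual action interfaces are not shown here. The sketch only conveys the general pattern of checking a data statistic, applying a logical plan rewrite, and recording a decision trace.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


# Hypothetical logical plan node; not OptBench's real plan representation.
@dataclass
class PlanNode:
    op: str
    children: List["PlanNode"] = field(default_factory=list)
    props: Dict[str, object] = field(default_factory=dict)


# Hypothetical rule: a metric-driven condition paired with a rewrite action.
@dataclass
class Rule:
    name: str
    condition: Callable[[Dict[str, float]], bool]
    action: Callable[[PlanNode], PlanNode]


def dense_to_sparse(plan: PlanNode) -> PlanNode:
    """Toy rewrite: return a copy of the plan with MatMul nodes marked sparse."""
    props = dict(plan.props)
    if plan.op == "MatMul":
        props["kernel"] = "sparse"
    return PlanNode(plan.op, [dense_to_sparse(c) for c in plan.children], props)


def optimize(
    plan: PlanNode, stats: Dict[str, float], rules: List[Rule]
) -> Tuple[PlanNode, List[str]]:
    """Apply each rule whose condition holds; return the plan and a decision trace."""
    trace: List[str] = []
    for rule in rules:
        if rule.condition(stats):
            plan = rule.action(plan)
            trace.append(rule.name)
    return plan, trace


# Baseline plan: Project -> MatMul (dense) -> Scan.
baseline = PlanNode(
    "Project", [PlanNode("MatMul", [PlanNode("Scan")], {"kernel": "dense"})]
)

# Assumed threshold: fire the rewrite only when the input is more than 90% sparse.
rules = [
    Rule("sparsify_matmul", lambda s: s.get("sparsity", 0.0) > 0.9, dense_to_sparse)
]

optimized, trace = optimize(baseline, {"sparsity": 0.95}, rules)
unchanged, empty_trace = optimize(baseline, {"sparsity": 0.10}, rules)
```

With the high-sparsity statistics the rule fires and its name is appended to the trace; with the low-sparsity statistics the baseline plan is returned untouched and the trace stays empty. Returning a rewritten copy rather than mutating the input keeps the baseline available for the kind of side-by-side comparison the workbench describes.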