Exqutor: Extended Query Optimizer for Vector-augmented Analytical Queries

Hyunjoon Kim, Chaerim Lim, Hyeonjun An, Rathijit Sen, Kwanghyun Park

TL;DR

Exqutor tackles the bottleneck of cardinality estimation in vector-augmented analytical queries (VAQs) with two techniques: exact cardinality query optimization (ECQO) when a vector index is available, and adaptive sampling when it is not. Integrated into pgvector, VBASE, and DuckDB, Exqutor yields substantial speedups by giving the optimizer an accurate selectivity for each vector predicate and by reusing the index results obtained at planning time. The framework is evaluated on TPC-H, TPC-DS, and several embedding datasets, showing that it applies to multi-relational VAQs and to the complex workloads typical of Retrieval-Augmented Generation (RAG). Overall, Exqutor advances practical query optimization for hybrid relational-vector workloads, reducing planning and execution costs in modern data pipelines.

Abstract

Vector similarity search is becoming increasingly important for data science pipelines, particularly in Retrieval-Augmented Generation (RAG), where it enhances large language model inference by enabling efficient retrieval of relevant external knowledge. As RAG expands with table-augmented generation to incorporate structured data, workloads integrating table and vector search are becoming more prevalent. However, efficiently executing such queries remains challenging due to inaccurate cardinality estimation for vector search components, leading to suboptimal query plans. In this paper, we propose Exqutor, an extended query optimizer for vector-augmented analytical queries. Exqutor is a pluggable cardinality estimation framework designed to address this issue, leveraging exact cardinality query optimization techniques to enhance estimation accuracy when vector indexes (e.g., HNSW, IVF) are available. In scenarios lacking these indexes, we employ a sampling-based approach with adaptive sampling size adjustment, dynamically tuning the sample size to balance estimation accuracy and sampling overhead. This allows Exqutor to efficiently approximate vector search cardinalities while minimizing computational costs. We integrate our framework into pgvector, VBASE, and DuckDB, demonstrating performance improvements of up to four orders of magnitude on vector-augmented analytical queries.
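
The sampling-based path described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the L2 distance, the starting size of 64, and the "double until two successive estimates agree" stopping rule are all assumptions chosen for illustration. It only shows the general idea of estimating how many rows satisfy a vector range predicate from a uniform sample, and of growing the sample when the estimate looks unstable.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimate_range_cardinality(vectors, query, radius, sample_size, rng):
    """Estimate how many vectors lie within `radius` of `query`.

    Draws a uniform sample without replacement, counts the matches,
    and scales the hit ratio up to the full table size.
    """
    n = len(vectors)
    k = min(sample_size, n)
    sample = rng.sample(vectors, k)
    hits = sum(1 for v in sample if l2(v, query) <= radius)
    return hits / k * n


def adaptive_estimate(vectors, query, radius, start=64, tol=0.1, rng=None):
    """Hypothetical adaptive rule (an assumption, not the paper's):
    double the sample until two successive estimates differ by at
    most `tol` (relative), or until the whole table has been scanned.
    Returns (estimate, final_sample_size).
    """
    rng = rng or random.Random(0)
    size = min(start, len(vectors))
    prev = estimate_range_cardinality(vectors, query, radius, size, rng)
    while size < len(vectors):
        size = min(size * 2, len(vectors))
        cur = estimate_range_cardinality(vectors, query, radius, size, rng)
        if abs(cur - prev) <= tol * max(cur, prev, 1.0):
            return cur, size
        prev = cur
    return prev, size


if __name__ == "__main__":
    rng = random.Random(42)
    table = [[rng.random(), rng.random()] for _ in range(5000)]
    est, used = adaptive_estimate(table, [0.5, 0.5], 0.4, rng=rng)
    print(f"estimated matches: {est:.0f} using {used} sampled rows")
```

The trade-off mirrors the one named in the abstract: a larger sample tightens the estimate but costs more distance computations at planning time, so the sample is only grown when the smaller one appears unreliable.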

Paper Structure

This paper contains 15 sections, 6 equations, 14 figures, 2 tables.

Figures (14)

  • Figure 1: Extended RAG pipeline integrating vector search with structured data. This four-stage pipeline retrieves structured and vector-based contexts to generate informed responses to user prompts. (1) The user provides a prompt requesting an optimal discount rate. (2) The user transforms the prompt into a VAQ to retrieve the relevant structured and vector data. (3) The retrieved contexts, along with the user prompt, are provided as input to the LLM. (4) The LLM generates a response, delivering analytical insights based on the combined structured and vector-based retrieval results.
  • Figure 2: Execution time and generated query plan for the VAQ in Listing \ref{lst:rag_query} on pgvector. The optimal plan is the one the query optimizer generates when given the true cardinality of the vector similarity search. The ps_embedding column in the partsupp table (80M) has vector embeddings from the DEEP dataset.
  • Figure 3: Integration of Exqutor into a generalized vector database system. When a VAQ is processed, the original query plan is forwarded to Exqutor (➊), which calculates vector cardinality using ECQO or sampling-based cardinality estimation (➋). The estimated cardinality for vector range search is then returned to the query optimizer, allowing it to generate a more accurate and efficient execution plan (➌).
  • Figure 4: Query execution time for TPC-H VAQs with a vector index using three different datasets (SF100). Each subfigure compares query latency with and without Exqutor integration in pgvector, VBASE and DuckDB.
  • Figure 5: Query execution time on pgvector for TPC-H VAQs without a vector index (SF100). The fixed strategy uses a constant sample size of $N = 385$, whereas the adaptive sampling strategy dynamically adjusts the sample size based on Q-error.
  • ...and 9 more figures
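
The Figure 5 caption refers to two quantities that a short sketch can make concrete. Q-error is the standard cardinality-estimation metric, the larger of estimate/actual and actual/estimate. The caption does not say where $N = 385$ comes from; it happens to equal the textbook sample size for estimating a proportion at 95% confidence with a ±5% margin (Cochran's formula with $p = 0.5$), so that reading is assumed here. The `next_sample_size` rule, with its thresholds, floor, and cap, is hypothetical and only illustrates what "adjusting the sample size based on Q-error" could look like.

```python
import math


def q_error(estimated, actual):
    """Q-error: max(est/act, act/est), with both values clamped to >= 1
    so that empty results do not cause a division by zero."""
    est = max(float(estimated), 1.0)
    act = max(float(actual), 1.0)
    return max(est / act, act / est)


def cochran_sample_size(z=1.96, p=0.5, margin=0.05):
    """Cochran's sample size for a proportion: ceil(z^2 * p * (1 - p) / e^2).
    The defaults (95% confidence, 5% margin, p = 0.5) are an assumed
    explanation for the fixed N used in Figure 5."""
    return math.ceil(z * z * p * (1 - p) / (margin * margin))


def next_sample_size(current, qerr, grow_above=2.0, shrink_below=1.2,
                     floor=32, cap=100_000):
    """Hypothetical feedback rule (not taken from the paper):
    double the sample after a poor estimate, halve it after a very
    accurate one, and otherwise keep it unchanged."""
    if qerr > grow_above:
        return min(cap, current * 2)
    if qerr < shrink_below:
        return max(floor, current // 2)
    return current


if __name__ == "__main__":
    n = cochran_sample_size()
    print("fixed sample size:", n)
    print("after a 4x misestimate:", next_sample_size(n, q_error(400, 100)))
```

A feedback rule of this kind needs the actual cardinality, which is only known after a query has run, so in this sketch the adjustment applies to the next query rather than the current one.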