
Filtered Approximate Nearest Neighbor Search in Vector Databases: System Design and Performance Analysis

Abylay Amanbayev, Brian Tsan, Tri Dang, Florin Rusu

TL;DR

This work addresses the gap between FANNS algorithms and production vector databases by systematizing filtering strategies (Pre, Runtime, Post) and introducing a robust evaluation framework. It proposes Global-Local Selectivity (GLS) as a per-query measure of filter-vector independence, and introduces the MoReVec relational dataset to benchmark hybrid queries with joins. Through extensive experiments on FAISS, Milvus, and pgvector, the paper shows that architectural choices (e.g., Milvus’ hybrid search, pgvector’s optimizer) often override raw index performance, with IVFFlat outperforming HNSW under low-selectivity filters in some cases and GLS predicting recall variance. The study yields practical guidelines for index selection, parameter tuning, and plan verification, highlighting the need for selectivity-aware optimization in relational vector databases and providing an extended ANN-Benchmarks framework for future research.

Abstract

Retrieval-Augmented Generation (RAG) applications increasingly rely on Filtered Approximate Nearest Neighbor Search (FANNS) to combine semantic retrieval with metadata constraints. While algorithmic innovations for FANNS have been proposed, there remains a lack of understanding regarding how generic filtering strategies perform within Vector Databases. In this work, we systematize the taxonomy of filtering strategies and evaluate their integration into FAISS, Milvus, and pgvector. To provide a robust benchmarking framework, we introduce a new relational dataset, MoReVec, consisting of two tables, featuring 768-dimensional text embeddings and a rich schema of metadata attributes. We further propose the Global-Local Selectivity (GLS) correlation metric to quantify the relationship between filters and query vectors. Our experiments reveal that algorithmic adaptations within the engine often override raw index performance. Specifically, we find that: (1) Milvus achieves superior recall stability through hybrid approximate/exact execution; (2) pgvector's cost-based query optimizer frequently selects suboptimal execution plans, favoring approximate index scans even when exact sequential scans would yield perfect recall at comparable latency; and (3) partition-based indexes (IVFFlat) outperform graph-based indexes (HNSW) for low-selectivity queries. To facilitate this analysis, we extend the widely used ANN-Benchmarks to support filtered vector search and make it available online. Finally, we synthesize our findings into a set of practical guidelines for selecting index types and configuring query optimizers for hybrid search workloads.
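
To make the difference between filtering strategies concrete, the following is a minimal sketch of two of them under one common reading of the terms: applying the metadata filter before the vector search, and applying it after. It is not the paper's implementation, and the paper's own definitions may differ in detail. Exact brute-force search stands in for the ANN index, and the function names, the random data, the 5% filter, and the `fetch_k` over-fetch parameter are all assumptions made for illustration. Runtime filtering is omitted because it depends on index internals.

```python
import numpy as np

def cosine_distances(query, vectors):
    """Cosine distance from one query to each row of `vectors`."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return 1.0 - v @ q

def filter_then_search(query, vectors, mask, k):
    """Apply the metadata filter first, then search only the surviving rows."""
    candidate_ids = np.flatnonzero(mask)
    dists = cosine_distances(query, vectors[candidate_ids])
    order = np.argsort(dists)[:k]
    return candidate_ids[order]

def search_then_filter(query, vectors, mask, k, fetch_k):
    """Retrieve fetch_k unfiltered neighbors, then drop those failing the filter."""
    dists = cosine_distances(query, vectors)
    top = np.argsort(dists)[:fetch_k]
    return top[mask[top]][:k]

# Hypothetical data: 1,000 random vectors and a filter matching roughly 5% of rows.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 16))
mask = rng.random(1000) < 0.05
query = rng.normal(size=16)

exact = filter_then_search(query, vectors, mask, k=10)
approx = search_then_filter(query, vectors, mask, k=10, fetch_k=50)

# Recall of the search-then-filter result against the exact filtered top-k.
recall = len(set(exact) & set(approx)) / len(exact)
```

With a restrictive filter, searching first and filtering afterwards can return fewer than k results unless `fetch_k` is large, which is one way recall loss under filtering can arise in this toy setting.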

Paper Structure

This paper contains 45 sections, 7 equations, 17 figures, 5 tables.

Figures (17)

  • Figure 1: Comparison between correlation metrics. (a) The distance-based correlation (Patel et al., 2024) is sensitive to geometric density, potentially misidentifying dense clusters as non-correlated. (b) The proposed GLS correlation normalizes for density by comparing local selectivity $\sigma_l$ to the global baseline $\sigma_g$.
  • Figure 2: Graph traversal using the pre-filtering strategy in 1-layer HNSW index with $m=3$ and search parameters $k=3$, $ef_\text{search}=3$.
  • Figure 3: Filtering strategies for ANNS.
  • Figure 4: Distribution of per-query correlation values $\rho_q$ between metadata attributes and vector embeddings at different target selectivities.
  • Figure 5: Distribution of pairwise cosine distances for 1,000,000 randomly sampled vector pairs. Left: Movie embeddings only; Middle: Review embeddings only; Right: Mixed pairs with one movie and one review embedding each.
  • ...and 12 more figures
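
The Figure 1 caption describes GLS as comparing a local selectivity $\sigma_l$ to a global baseline $\sigma_g$. The sketch below computes only those two quantities for a toy dataset. It assumes that "local" means the fraction of a query's nearest neighbors that pass the filter, which this excerpt does not confirm, and it does not reproduce the paper's GLS formula. The function names, the `n_neighbors` parameter, and the two-cluster data are hypothetical.

```python
import numpy as np

def global_selectivity(mask):
    """Fraction of the whole dataset that satisfies the filter (sigma_g)."""
    return float(np.mean(mask))

def local_selectivity(query, vectors, mask, n_neighbors):
    """Fraction of the query's nearest vectors that satisfy the filter (sigma_l).

    Assumption: the neighborhood is the n_neighbors closest points
    by Euclidean distance.
    """
    dists = np.linalg.norm(vectors - query, axis=1)
    nearest = np.argsort(dists)[:n_neighbors]
    return float(np.mean(mask[nearest]))

rng = np.random.default_rng(1)
# Two well-separated clusters; the hypothetical filter matches only cluster A.
cluster_a = rng.normal(loc=0.0, scale=0.5, size=(200, 8))
cluster_b = rng.normal(loc=5.0, scale=0.5, size=(800, 8))
vectors = np.vstack([cluster_a, cluster_b])
mask = np.arange(1000) < 200

sigma_g = global_selectivity(mask)
sigma_l_inside = local_selectivity(cluster_a[0], vectors, mask, n_neighbors=50)
sigma_l_outside = local_selectivity(cluster_b[0], vectors, mask, n_neighbors=50)
```

In this toy setup, a query drawn from the matching cluster sees a local selectivity well above the global baseline, while a query from the other cluster sees one well below it, even though both share the same global selectivity.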