Efficient Approximate Nearest Neighbor Search under Multi-Attribute Range Filter

Yuanhang Yu, Dawei Cheng, Ying Zhang, Lu Qin, Wenjie Zhang, Xuemin Lin

TL;DR

KHI is an index for multi-attribute RFANNS that combines an attribute-space partitioning tree with HNSW graphs attached to the tree nodes. In experiments on four real-world datasets, it consistently achieves high query throughput while maintaining high recall.

Abstract

Nearest neighbor search on high-dimensional vectors is fundamental in modern AI and database systems. In many real-world applications, queries involve constraints on multiple numeric attributes, giving rise to range-filtering approximate nearest neighbor search (RFANNS). While there exist RFANNS indexes for single-attribute range predicates, extending them to the multi-attribute setting is nontrivial and often ineffective. In this paper, we propose KHI, an index for multi-attribute RFANNS that combines an attribute-space partitioning tree with HNSW graphs attached to tree nodes. A skew-aware splitting rule bounds the tree height by $O(\log n)$, and queries are answered by routing through the tree and running greedy search on the HNSW graphs. Experiments on four real-world datasets show that KHI consistently achieves high query throughput while maintaining high recall. Compared with the state-of-the-art RFANNS baseline, KHI improves QPS by $2.46\times$ on average and up to $16.22\times$ on the hard dataset, with larger gains for smaller selectivity, larger $k$, and higher predicate cardinality.
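
The tree-plus-graph idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the skew-aware splitting rule is replaced by a plain median split on the widest attribute, the per-node HNSW greedy search is replaced by an exact scan over the node's objects, and all names (`Node`, `build`, `search`, `leaf_size`) are hypothetical. It only shows how a query might be routed: subtrees whose attribute box is disjoint from the query range are skipped, nodes fully covered by the range are searched directly, and partially overlapping leaves are filtered object by object.

```python
import heapq
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]

@dataclass
class Node:
    ids: List[int]                    # objects under this node (KHI would attach an HNSW graph here)
    lo: Point                         # lower corner of the node's attribute bounding box
    hi: Point                         # upper corner of the node's attribute bounding box
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(ids, attrs, leaf_size=4):
    """Assumed stand-in for the paper's splitting rule: median split on the widest attribute."""
    dims = range(len(attrs[ids[0]]))
    lo = tuple(min(attrs[i][d] for i in ids) for d in dims)
    hi = tuple(max(attrs[i][d] for i in ids) for d in dims)
    node = Node(list(ids), lo, hi)
    if len(ids) > leaf_size:
        d = max(dims, key=lambda j: hi[j] - lo[j])
        order = sorted(ids, key=lambda i: attrs[i][d])
        mid = len(order) // 2         # splitting by rank keeps both children non-empty and balanced
        node.left = build(order[:mid], attrs, leaf_size)
        node.right = build(order[mid:], attrs, leaf_size)
    return node

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search(node, q, qlo, qhi, k, vecs, attrs):
    """Return ids of the k nearest vectors whose attributes lie in [qlo, qhi]."""
    dims = range(len(qlo))
    if any(node.hi[d] < qlo[d] or node.lo[d] > qhi[d] for d in dims):
        return []                     # node box is disjoint from the query range
    if all(qlo[d] <= node.lo[d] and node.hi[d] <= qhi[d] for d in dims):
        cand = node.ids               # fully covered: every object qualifies, no filtering needed
    elif node.left is None:
        # partially covered leaf: check each object's attributes
        cand = [i for i in node.ids
                if all(qlo[d] <= attrs[i][d] <= qhi[d] for d in dims)]
    else:
        cand = (search(node.left, q, qlo, qhi, k, vecs, attrs)
                + search(node.right, q, qlo, qhi, k, vecs, attrs))
    # exact scan used here in place of greedy search on an HNSW graph
    return heapq.nsmallest(k, cand, key=lambda i: sq_dist(vecs[i], q))

random.seed(0)
n = 200
vecs = [tuple(random.random() for _ in range(8)) for _ in range(n)]
attrs = [(random.uniform(0, 100), random.uniform(0, 10)) for _ in range(n)]
root = build(list(range(n)), attrs)
q = tuple(random.random() for _ in range(8))
result = search(root, q, (20.0, 2.0), (80.0, 9.0), 5, vecs, attrs)
```

Because the stand-in search is exact, `result` matches a brute-force filtered scan. With real HNSW graphs at the nodes, the covered-node step would be approximate, which is where the recall/throughput trade-off reported in the abstract would arise.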

Paper Structure

This paper contains 18 sections, 2 theorems, 7 figures, 3 tables, 5 algorithms.

Key Result

Lemma 1

The height of $T$ is $O(\log \frac{n}{c_l})$.
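
This page does not reproduce the splitting rule or define $c_l$, so the following is only a sketch of why a bound of this shape would hold. It assumes that $c_l$ is the leaf capacity (a node is split only if it holds more than $c_l$ objects) and that each split leaves at most a fraction $\alpha$ of a node's objects in either child, for some constant $\frac{1}{2} \le \alpha < 1$. Both assumptions, and the symbols $\alpha$, $n_h$, and $h$, are introduced here for illustration.

```latex
% n_h: maximum number of objects in any node at depth h (the root has depth 0)
n_h \le \alpha^{h} n
% an internal node at depth h must hold more than c_l objects
\alpha^{h} n > c_l
\;\Longrightarrow\;
h < \log_{1/\alpha} \frac{n}{c_l}
% leaves sit at most one level below the deepest internal node
\operatorname{height}(T) \le \log_{1/\alpha} \frac{n}{c_l} + 1
  = O\!\left(\log \frac{n}{c_l}\right)
```

Under these assumptions, a median split ($\alpha = \frac{1}{2}$, ignoring rounding) gives a height of roughly $\log_2 \frac{n}{c_l}$.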

Figures (7)

  • Figure 1: An example object set $O$ and an example query $Q$.
  • Figure 2: Running example illustrating neighbor lists of $o_4$ induced by R-tree partitions over the object set $O$.
  • Figure 3: Example of the KHI index over the object set $O$.
  • Figure 4: Overall query performance.
  • Figure 5: Evolution of distance threshold during search.
  • ...and 2 more figures

Theorems & Definitions (8)

  • Definition 1
  • Definition 2
  • Definition 3
  • Example 1
  • Example 2
  • Lemma 1
  • Example 3
  • Lemma 2