GateANN: I/O-Efficient Filtered Vector Search on SSDs

Nakyung Lee, Soobin Cho, Jiwoong Park, Gyuyeong Kim

Abstract

We present GateANN, an I/O-efficient SSD-based graph ANNS system that supports filtered vector search on an unmodified graph index. Existing SSD-based systems either waste I/O by post-filtering, or require expensive filter-aware index rebuilds. GateANN avoids both by decoupling graph traversal from vector retrieval. Our key insight is that traversing a node requires only its neighbor list and an approximate distance, neither of which needs the full-precision vector on SSD. Based on this, GateANN introduces graph tunneling. It checks each node's filter predicate in memory before issuing I/O and routes through non-matching nodes entirely in memory, preserving graph connectivity without any SSD read for non-matching nodes. Our experimental results show that it reduces SSD reads by up to 10x and improves throughput by up to 7.6x.
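
To make the decoupling of traversal from vector retrieval concrete, the following is a minimal sketch of a filter-gated best-first search on a toy graph. It is not the authors' implementation: the function `gated_search`, the dictionary-based stores, the label-equality filter, and all distance values are assumptions chosen for illustration, and an SSD read is simulated by incrementing a counter. For simplicity the sketch explores until the frontier is empty rather than using a bounded candidate list.

```python
import heapq

def gated_search(entry, query_label, k, neighbors, labels, approx_dist, exact_dist):
    """Toy filter-gated best-first search.

    Returns (top-k matching results, number of simulated SSD reads).
    All stores are plain dicts standing in for in-memory structures (assumption).
    """
    frontier = [(approx_dist[entry], entry)]  # min-heap ordered by approximate distance
    visited = {entry}
    results = []
    ssd_reads = 0
    while frontier:
        _, node = heapq.heappop(frontier)
        if labels[node] == query_label:
            # Filter passes: simulate one SSD read for the full-precision vector.
            ssd_reads += 1
            results.append((exact_dist[node], node))
        # Filter fails: no read is issued; the node is only routed through.
        # In both cases the adjacency list comes from memory in this sketch.
        for nbr in neighbors[node]:
            if nbr not in visited:
                visited.add(nbr)
                heapq.heappush(frontier, (approx_dist[nbr], nbr))
    results.sort()
    return results[:k], ssd_reads

# Hypothetical 5-node graph: nodes 1 and 2 do not match the query label "a".
neighbors = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
labels = {0: "a", 1: "b", 2: "b", 3: "a", 4: "a"}
approx_dist = {0: 0.9, 1: 0.7, 2: 0.8, 3: 0.4, 4: 0.1}
exact_dist = {0: 0.92, 1: 0.71, 2: 0.79, 3: 0.41, 4: 0.12}

results, reads = gated_search(0, "a", 2, neighbors, labels, approx_dist, exact_dist)
print(results, reads)
```

In this toy run, nodes 3 and 4 are reachable only through the non-matching nodes 1 and 2, yet they are still found, and only the three matching nodes incur a simulated read. A post-filtering search over the same graph would read all five.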

Paper Structure (34 sections, 1 equation, 18 figures, 5 tables, 1 algorithm)

Figures (18)

  • Figure 1: Motivating experiments on BigANN-100M with 10% selectivity. (a) Post-filtering systems plateau early because of wasted per-node work. (b) Naïve pre-filtering destroys graph connectivity, degrading throughput and recall.
  • Figure 2: Different approaches for filtered vector search.
  • Figure 3: GateANN overview. A candidate is first checked against the in-memory filter store. Filter-passing nodes follow the normal SSD path: an asynchronous read followed by exact distance computation. Filter-failing nodes are routed to the graph tunneling path: the neighbor store provides the adjacency list, PQ distances are computed in memory, and promising neighbors are inserted back into the candidate list. Both paths feed into the same sorted frontier.
  • Figure 4: Graph tunneling example. ➊ Node $B$ passes the filter and follows the normal SSD path. ➋ Node $C$ is selected next but fails the filter check. ➌ GateANN reads $C$'s neighbors ($N_1$--$N_4$) from the neighbor store in memory. ➍ PQ distances to the query are computed for each neighbor; neighbors below the frontier threshold (here $0.45$) are kept ($N_1$, $N_3$) and the rest ($N_2$, $N_4$) are discarded. ➎ Promising neighbors are inserted into the candidate list, and $C$ is marked visited but ineligible for results.
  • Figure 5: Recall--latency (1 thread) and throughput--recall (32 threads) tradeoff curves on 100M-scale datasets. GateANN consistently outperforms both PipeANN and DiskANN across the entire recall range.
  • ...and 13 more figures
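
The tunneling step described in the Figure 4 caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: `tunnel_step` is a hypothetical helper, the PQ distances are made-up values chosen so that $N_1$ and $N_3$ fall below the $0.45$ threshold from the caption, and the pre-existing candidate `D` is invented to show the sorted insert.

```python
import bisect

def tunnel_step(node, neighbor_store, pq_distance, frontier, threshold, visited):
    """Expand a filter-failing node entirely in memory (no SSD read).

    frontier is a list of (distance, node_id) tuples kept in ascending order.
    """
    visited.add(node)  # marked visited, but never added to the result set
    for nbr in neighbor_store[node]:
        if nbr in visited:
            continue
        d = pq_distance[nbr]  # approximate distance to the query
        if d < threshold:
            # Promising neighbor: insert while keeping the frontier sorted.
            bisect.insort(frontier, (d, nbr))
    return frontier

# In-memory adjacency list for the filter-failing node C (as in Figure 4).
neighbor_store = {"C": ["N1", "N2", "N3", "N4"]}
# Hypothetical PQ distances; only the 0.45 threshold comes from the caption.
pq_distance = {"N1": 0.30, "N2": 0.60, "N3": 0.40, "N4": 0.70}

frontier = [(0.35, "D")]  # hypothetical candidate already in the list
visited = set()
tunnel_step("C", neighbor_store, pq_distance, frontier, 0.45, visited)
print(frontier)
```

With these assumed distances, $N_1$ and $N_3$ join the candidate list and $N_2$ and $N_4$ are discarded, matching the outcome in the caption.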