Adaptive Prefiltering for High-Dimensional Similarity Search: A Frequency-Aware Approach

Teodor-Ioan Calin

TL;DR

This work presents an adaptive prefiltering framework that uses query frequency patterns and cluster coherence metrics to allocate computational budgets dynamically. Lightweight frequency tracking keeps overhead minimal, and coherence-based fallback policies provide graceful degradation for unseen queries.

Abstract

High-dimensional similarity search underpins modern retrieval systems, yet uniform search strategies fail to exploit the heterogeneous nature of real-world query distributions. We present an adaptive prefiltering framework that leverages query frequency patterns and cluster coherence metrics to dynamically allocate computational budgets. Our approach partitions the query space into frequency tiers following Zipfian distributions and assigns differentiated search policies based on historical access patterns and local density characteristics. Experiments on ImageNet-1k using CLIP embeddings demonstrate that frequency-aware budget allocation achieves equivalent recall with 20.4% fewer distance computations compared to static nprobe selection, while maintaining sub-millisecond latency on GPU-accelerated FAISS indices. The framework introduces minimal overhead through lightweight frequency tracking and provides graceful degradation for unseen queries through coherence-based fallback policies.
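
The budget allocation described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the class name FrequencyAwareBudget, the tier thresholds (100 and 10 past queries), the per-tier nprobe values, the choice to give frequent queries a smaller budget, and the linear coherence fallback for unseen queries. The query key is assumed to be some hashable identifier such as the id of the nearest coarse centroid.

```python
from collections import Counter


class FrequencyAwareBudget:
    """Hypothetical policy mapping a query key to an nprobe budget.

    All thresholds and budgets below are illustrative assumptions,
    not values reported in the paper.
    """

    def __init__(self, head_min=100, torso_min=10, budgets=None, fallback=(8, 64)):
        self.counts = Counter()      # lightweight frequency tracking
        self.head_min = head_min     # assumed count threshold for the "head" tier
        self.torso_min = torso_min   # assumed count threshold for the "torso" tier
        self.budgets = budgets or {"head": 4, "torso": 16, "tail": 64}
        self.fallback = fallback     # (min, max) nprobe for unseen queries

    def record(self, key):
        """Count one more occurrence of this query key."""
        self.counts[key] += 1

    def tier(self, key):
        """Classify a key into a frequency tier from its historical count."""
        count = self.counts.get(key, 0)
        if count >= self.head_min:
            return "head"
        if count >= self.torso_min:
            return "torso"
        if count >= 1:
            return "tail"
        return "unseen"

    def nprobe(self, key, coherence=None):
        """Return the search budget for a key.

        Seen keys use their tier budget. Unseen keys fall back to a
        coherence-based budget: higher coherence (clamped to [0, 1])
        yields fewer probes. Without a coherence value, the maximum
        fallback budget is used.
        """
        tier = self.tier(key)
        if tier != "unseen":
            return self.budgets[tier]
        low, high = self.fallback
        if coherence is None:
            return high
        c = min(max(coherence, 0.0), 1.0)
        return round(high - c * (high - low))


policy = FrequencyAwareBudget()
for _ in range(150):
    policy.record(7)                          # key 7 becomes a "head" query
policy.record(3)                              # key 3 is seen once: "tail"

head_budget = policy.nprobe(7)                # tier budget for a frequent key
tail_budget = policy.nprobe(3)                # tier budget for a rare key
unseen_budget = policy.nprobe(99, coherence=0.5)  # coherence-based fallback
```

With a FAISS IVF index, the returned value would typically be assigned to the index's nprobe attribute before calling search. How the paper actually derives its tiers and budgets is not specified in this summary.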

Paper Structure

This paper contains 27 sections, 2 theorems, 4 equations, 1 figure, 1 table, 1 algorithm.

Key Result

Proposition 1

For clusters formed from embeddings of a contrastively trained model, the expected coherence $\mathbb{E}[\rho(\mathcal{C}_i)]$ scales with the training frequency $f_i$ of concepts in cluster $i$ as $\mathbb{E}[\rho(\mathcal{C}_i)] \propto f_i^{\alpha}$ for some $\alpha > 0$ determined by the training dynamics.
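
If the relationship is taken to be a pure power law, coherence proportional to $f_i^{\alpha}$ (an assumption about the exact form, suggested by the proposition's name), then the exponent can be estimated as the slope of a straight-line fit in log-log space. The sketch below uses synthetic data generated with a known exponent of 0.25 and an arbitrary prefactor of 0.1. These numbers are invented purely to show the fitting procedure and are not results from the paper.

```python
import numpy as np

# Synthetic, illustrative data only: frequencies and coherences that follow
# rho = 0.1 * f ** 0.25 exactly. Neither value comes from the paper.
frequencies = np.array([1.0, 10.0, 100.0, 1000.0])
coherences = 0.1 * frequencies ** 0.25

# A power law rho = c * f ** alpha is linear in log-log space:
#   log(rho) = alpha * log(f) + log(c)
# so a degree-1 least-squares fit recovers alpha as the slope.
alpha, log_c = np.polyfit(np.log(frequencies), np.log(coherences), 1)
prefactor = np.exp(log_c)
```

On real measurements the points would scatter around the line, and the quality of the fit would indicate how well a single exponent describes the data.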

Figures (1)

  • Figure 1: Pareto frontier of search efficiency. The adaptive strategy (green) achieves higher recall for equivalent cost in the critical operating regions. At moderate costs (200-600 vectors visited), the adaptive approach yields superior performance.

Theorems & Definitions (4)

  • Definition 1: Search Cost
  • Definition 2: Cluster Coherence
  • Proposition 1: Frequency-Coherence Power Law
  • Theorem 1: Heterogeneous Efficiency