Probabilistic Kernel Function for Fast Angle Testing
Kejing Lu, Chuan Xiao, Yoshiharu Ishikawa
TL;DR
This work tackles efficient angle testing in high-dimensional spaces for similarity search. It introduces two projection-based probabilistic kernel functions, KS1 and KS2, that use reference angles and a deterministic structure for the projection vectors, avoiding Gaussian randomness and asymptotic requirements and achieving o(d) computation. The paper provides theoretical probability guarantees and a detailed complexity analysis, and demonstrates practical benefits: KS1 improves CEOs-based tasks such as k-MIPS, while KS2 enables a new probabilistic routing test that speeds up graph-based ANNS, delivering 2.5x to 3x higher QPS than HNSW. Together, these contributions offer a scalable, deterministic approach to fast angle testing with direct application to high-dimensional similarity search.
Abstract
In this paper, we study the angle testing problem in the context of similarity search in high-dimensional Euclidean spaces and propose two projection-based probabilistic kernel functions, one designed for angle comparison and the other for angle thresholding. Unlike existing approaches that rely on random projection vectors drawn from Gaussian distributions, our approach leverages reference angles and employs a deterministic structure for the projection vectors. Notably, our kernel functions do not require asymptotic assumptions, such as the number of projection vectors tending to infinity, and can be shown both theoretically and experimentally to outperform Gaussian-distribution-based kernel functions. We apply the proposed kernel function to Approximate Nearest Neighbor Search (ANNS) and demonstrate that our approach achieves 2.5x to 3x higher queries-per-second (QPS) throughput than the widely used graph-based search algorithm HNSW.
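To make the angle testing problem concrete, the following minimal sketch shows a textbook Gaussian-projection angle estimator of the general kind the abstract contrasts against. It is not KS1 or KS2, and it is not necessarily the specific Gaussian baseline evaluated in the paper. It uses the standard sign-random-projection identity: for a Gaussian vector a, the probability that sign(a·x) and sign(a·y) differ equals θ/π, where θ is the angle between x and y. All function names, the dimension d, and the number of projections m are illustrative assumptions.

```python
import math
import random


def exact_angle(x, y):
    """Exact angle in radians between x and y; costs O(d) per pair."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    cos = max(-1.0, min(1.0, dot / (nx * ny)))  # clamp against rounding
    return math.acos(cos)


def gaussian_projections(m, d, rng):
    """Draw m projection vectors with i.i.d. N(0, 1) entries (assumed baseline)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]


def sign_sketch(v, projections):
    """One bit per projection vector: is the projection of v non-negative?"""
    return [sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0.0 for p in projections]


def estimated_angle(sketch_x, sketch_y):
    """Estimate the angle as pi times the fraction of mismatching sign bits."""
    mismatches = sum(a != b for a, b in zip(sketch_x, sketch_y))
    return math.pi * mismatches / len(sketch_x)


def passes_angle_threshold(sketch_q, sketch_v, theta):
    """Probabilistic threshold test: is the estimated angle at most theta?"""
    return estimated_angle(sketch_q, sketch_v) <= theta


def demo(d=64, m=2048, seed=0):
    """Compare the exact angle with the sketch-based estimate on one pair."""
    rng = random.Random(seed)
    q = [rng.gauss(0.0, 1.0) for _ in range(d)]
    v = [q_i + 0.5 * rng.gauss(0.0, 1.0) for q_i in q]  # noisy copy of q
    projections = gaussian_projections(m, d, rng)
    sketch_q = sign_sketch(q, projections)
    sketch_v = sign_sketch(v, projections)
    return exact_angle(q, v), estimated_angle(sketch_q, sketch_v)


if __name__ == "__main__":
    exact, estimate = demo()
    print(f"exact angle: {exact:.3f} rad, estimated angle: {estimate:.3f} rad")
```

Once the sketches are precomputed, each test compares m bits instead of evaluating a d-dimensional inner product. The accuracy of this estimator improves only as m grows, which is the kind of asymptotic dependence the abstract says the proposed kernel functions avoid.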
