Table of Contents
Fetching ...

A Survey on Locality Sensitive Hashing Algorithms and their Applications

Omid Jafari, Preeti Maurya, Parth Nagarkar, Khandker Mushfiqul Islam, Chidambaram Crushev

TL;DR

This survey targets the ANN problem in high-dimensional spaces, arguing for Locality Sensitive Hashing as a scalable, theory-backed approach. It classifies LSH techniques by distance metrics, surveys numerous improvements (multi-probe, data-dependent, kernelized, Bayesian), and details distributed frameworks that boost throughput. A major contribution is the comprehensive catalog of applications across audio, image/video, security/privacy, biology, geoscience, graphs, time series, healthcare, and robotics, illustrating LSH’s practical impact. The paper thus provides both methodological guidance and domain-specific workflows to implement efficient, approximate nearest neighbor search in real-world settings.

Abstract

Finding nearest neighbors in high-dimensional spaces is a fundamental operation in many diverse application domains. Locality Sensitive Hashing (LSH) is one of the most popular techniques for finding approximate nearest neighbor searches in high-dimensional spaces. The main benefits of LSH are its sub-linear query performance and theoretical guarantees on the query accuracy. In this survey paper, we provide a review of state-of-the-art LSH and Distributed LSH techniques. Most importantly, unlike any other prior survey, we present how Locality Sensitive Hashing is utilized in different application domains.

A Survey on Locality Sensitive Hashing Algorithms and their Applications

TL;DR

This survey targets the ANN problem in high-dimensional spaces, arguing for Locality Sensitive Hashing as a scalable, theory-backed approach. It classifies LSH techniques by distance metrics, surveys numerous improvements (multi-probe, data-dependent, kernelized, Bayesian), and details distributed frameworks that boost throughput. A major contribution is the comprehensive catalog of applications across audio, image/video, security/privacy, biology, geoscience, graphs, time series, healthcare, and robotics, illustrating LSH’s practical impact. The paper thus provides both methodological guidance and domain-specific workflows to implement efficient, approximate nearest neighbor search in real-world settings.

Abstract

Finding nearest neighbors in high-dimensional spaces is a fundamental operation in many diverse application domains. Locality Sensitive Hashing (LSH) is one of the most popular techniques for finding approximate nearest neighbor searches in high-dimensional spaces. The main benefits of LSH are its sub-linear query performance and theoretical guarantees on the query accuracy. In this survey paper, we provide a review of state-of-the-art LSH and Distributed LSH techniques. Most importantly, unlike any other prior survey, we present how Locality Sensitive Hashing is utilized in different application domains.

Paper Structure

This paper contains 38 sections.