Almost Linear Time Consistent Mode Estimation and Quick Shift Clustering
Sajjad Hashemian
TL;DR
The paper tackles scalable density-based clustering in high dimensions by combining Locality-Sensitive Hashing (LSH) with the Quick Shift framework, using LSH to approximate kernel density estimation (KDE). It introduces LSH-KDE and a fast Quick Shift variant (LSH-QuickShift) that builds a directed clustering graph in near-linear time and space while preserving consistency guarantees for mode estimation and point-to-mode assignments. Theoretical results quantify the estimation error and the separation conditions under approximate densities, and experiments on clustering and image segmentation show strong performance and scalability compared to established baselines. The result is a practical, provably consistent method for large-scale, high-dimensional density-based clustering, with applications such as image segmentation.
Abstract
In this paper, we propose a method for density-based clustering in high-dimensional spaces that combines Locality-Sensitive Hashing (LSH) with the Quick Shift algorithm. The Quick Shift algorithm, known for its hierarchical clustering capabilities, is extended by integrating approximate Kernel Density Estimation (KDE) using LSH to provide efficient density estimates. The proposed approach achieves almost linear time complexity while preserving the consistency of density-based clustering.
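To make the pipeline concrete, the following is a minimal sketch of the two ingredients named above: a hash-based density proxy and the Quick Shift linking rule. It is an illustration under stated assumptions, not the estimator or the algorithm analyzed in the paper. The density proxy (average size of a point's bucket under random-projection Euclidean LSH) is a crude stand-in for LSH-KDE, the linking step uses a brute-force quadratic scan rather than a near-linear search, and all names and parameters (`lsh_density`, `quick_shift`, `roots`, `n_tables`, `n_hashes`, `width`, `tau`) are hypothetical.

```python
import math
import random
from collections import defaultdict


def lsh_density(points, n_tables=8, n_hashes=2, width=1.0, seed=0):
    """Crude density proxy: average size of the LSH bucket a point falls in.

    Assumption: this stands in for an LSH-based KDE; it is not the
    estimator from the paper.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    density = [0.0] * len(points)
    for _ in range(n_tables):
        # Each table concatenates n_hashes Euclidean LSH functions
        # h(x) = floor((a . x + b) / width) with Gaussian a, uniform b.
        funcs = [
            ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
            for _ in range(n_hashes)
        ]
        buckets = defaultdict(list)
        for i, x in enumerate(points):
            key = tuple(
                math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / width)
                for a, b in funcs
            )
            buckets[key].append(i)
        for members in buckets.values():
            for i in members:
                density[i] += len(members) / n_tables
    return density


def quick_shift(points, density, tau):
    """Link each point to its nearest neighbor of higher density within tau.

    Ties in density are broken by index, so the links form a forest.
    Brute-force O(n^2) scan, kept only for clarity in this sketch.
    """
    parent = list(range(len(points)))
    for i, x in enumerate(points):
        best = tau
        for j, y in enumerate(points):
            if (density[j], j) > (density[i], i):
                d = math.dist(x, y)
                if d <= best:
                    best, parent[i] = d, j
    return parent


def roots(parent):
    """Follow parent links to the root (estimated mode) of each point."""
    labels = []
    for i in range(len(parent)):
        while parent[i] != i:
            i = parent[i]
        labels.append(i)
    return labels
```

Calling `roots(quick_shift(points, lsh_density(points), tau))` assigns each point the index of the root it flows to, which plays the role of its mode. The threshold `tau` controls how far a point may travel to reach a denser neighbor, so larger values merge more clusters.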
