SACA: Selective Attention-Based Clustering Algorithm
Meysam Shirdel Bilehsavar, Razieh Ghaedi, Samira Seyed Taheri, Xinqi Fan, Christian O'Reilly
TL;DR
The paper tackles the challenge of parameter tuning in density-based clustering by introducing SACA, a Selective Attention-Based Clustering Algorithm that derives a data-driven pruning threshold to isolate high-density cluster cores and then reintegrates sparse points. It combines a principled core-extraction phase with flexible reintegration (via nearest-neighbor or centroid-based assignment) and an intuitive Attention Selectivity Coefficient for multi-level pattern discovery. Across 16 benchmark datasets and six metrics, SACA demonstrates robust, accurate performance with minimal tuning, outperforming DBSCAN, HDBSCAN, and OPTICS in many scenarios. The approach emphasizes interpretability and practicality, offering memory-time trade-offs (including a KD-tree variant) and outlining directions for memory-efficient and noise-adaptive extensions.
Abstract
Clustering algorithms are fundamental tools across many fields, with density-based methods offering particular advantages in identifying arbitrarily shaped clusters and handling noise. However, their effectiveness is often limited by the need for users to tune critical parameters, which typically demands significant domain expertise. This paper introduces a novel density-based clustering algorithm, loosely inspired by the concept of selective attention, that is designed to minimize reliance on parameter tuning for most applications. The proposed method computes an adaptive threshold to exclude sparsely distributed points and outliers, constructs an initial cluster framework from the remaining points, and then reintegrates the filtered points to refine the final results. Extensive experiments on diverse benchmark datasets demonstrate the robustness, accuracy, and ease of use of the proposed approach, establishing it as a powerful alternative to conventional density-based clustering techniques.
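The three phases named in the abstract (threshold-based pruning, an initial cluster framework, reintegration) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: the density proxy (distance to the k-th nearest neighbor), the threshold rule (mean plus `selectivity` times the standard deviation of those distances), the linking rule (connected components of core points within the threshold distance), and the function and parameter names. In particular, `selectivity` is only a hypothetical stand-in and does not reproduce the paper's Attention Selectivity Coefficient.

```python
import math
from statistics import mean, pstdev


def kth_neighbor_distances(points, k):
    """Brute-force distance from each point to its k-th nearest neighbor."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def prune_cluster_reintegrate(points, k=2, selectivity=1.0):
    """Return one cluster label per point (illustrative three-phase sketch)."""
    # Phase 1: derive a threshold from the data and set sparse points aside.
    # ASSUMPTION: mean + selectivity * std of k-NN distances, not the paper's rule.
    knn = kth_neighbor_distances(points, k)
    threshold = mean(knn) + selectivity * pstdev(knn)
    core = [i for i, d in enumerate(knn) if d <= threshold]
    sparse = [i for i, d in enumerate(knn) if d > threshold]

    # Phase 2: build the initial clusters from core points only.
    # ASSUMPTION: core points closer than the threshold are linked, and each
    # connected component becomes one cluster.
    labels = [-1] * len(points)
    next_label = 0
    for seed in core:
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            a = stack.pop()
            for b in core:
                if labels[b] == -1 and math.dist(points[a], points[b]) <= threshold:
                    labels[b] = next_label
                    stack.append(b)
        next_label += 1

    # Phase 3: reintegrate each pruned point into the cluster of its nearest
    # core point (the nearest-neighbor option mentioned in the TL;DR).
    for i in sparse:
        nearest = min(core, key=lambda c: math.dist(points[i], points[c]))
        labels[i] = labels[nearest]
    return labels


# Two compact groups plus one sparse point near each of them.
blob = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
points = blob + [(x + 10, y + 10) for x, y in blob] + [(3, 0.5), (10.5, 13)]
labels = prune_cluster_reintegrate(points)
```

On this toy input the two sparse points are excluded while the clusters are formed and are then assigned to the nearest group. The brute-force neighbor search is quadratic in the number of points; it is used here only to keep the sketch dependency-free.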
