Spectral Toolkit of Algorithms for Graphs: Technical Report (2)
Peter Macgregor, He Sun
TL;DR
STAG 2.0 extends the open-source graph-analysis toolkit with three scalable components: Euclidean Locality Sensitive Hashing for approximate nearest-neighbor search, CKNS-based Gaussian Kernel Density Estimation for fast density queries, and an MS-based fast spectral clustering pipeline. The report provides a user guide, API descriptions, and demonstrations showing how CKNS KDE enables efficient similarity graph construction and scalable clustering on large datasets. It discusses design decisions, parameter choices, and performance comparisons, illustrating STAG's applicability to large-scale graph-based data analysis in both C++ and Python. The integrated approach avoids the computational cost of building a fully connected similarity graph while preserving clustering structure and, where applicable, retaining theoretical guarantees.
Abstract
Spectral Toolkit of Algorithms for Graphs (STAG) is an open-source library for efficient graph algorithms. This technical report presents the newly implemented components for locality sensitive hashing, kernel density estimation, and fast spectral clustering. The report includes a user's guide to the newly implemented algorithms, experiments and demonstrations of the new functionality, and several technical considerations behind our development.
