Consistent Spectral Clustering in Hyperbolic Spaces
Sagar Ghosh, Swagatam Das
TL;DR
This work introduces HSCA, a spectral clustering framework that operates in hyperbolic space to better capture hierarchical and tree-like data structures, which Euclidean representations handle poorly. The method embeds data into the Poincaré disc, constructs geodesic-based affinities, and performs spectral clustering on the normalized hyperbolic Laplacian. It is shown to be weakly consistent and to converge at least as fast as Euclidean spectral clustering. The paper also adapts established Euclidean techniques to the hyperbolic setting (e.g., landmark-based HSCA-HLS K and fast variants) and provides empirical validation on real and synthetic datasets, showing improved clustering quality, particularly for hierarchical data. The theoretical guarantees, together with the practical algorithms and ablation studies, suggest that non-Euclidean spaces, and hyperbolic geometry in particular, offer a useful framework for efficient and meaningful clustering of complex data.
Abstract
Clustering, as an unsupervised technique, plays a pivotal role in various data analysis applications. Among clustering algorithms, spectral clustering on Euclidean spaces has been studied extensively. However, as data grow more complex, Euclidean space is proving inefficient both for representing data and for the learning algorithms that operate on it. Although deep neural networks on hyperbolic spaces have gained recent traction, clustering algorithms and other non-deep machine learning models on non-Euclidean spaces remain underexplored. In this paper, we propose a spectral clustering algorithm on hyperbolic spaces to address this gap. Hyperbolic spaces offer advantages in representing complex data structures, such as hierarchies and trees, which cannot be embedded efficiently in Euclidean spaces. Our proposed algorithm replaces the Euclidean similarity matrix with an appropriate hyperbolic similarity matrix, demonstrating improved efficiency compared to clustering in Euclidean spaces. Our contributions include the development of the spectral clustering algorithm on hyperbolic spaces and the proof of its weak consistency. We show that our algorithm converges at least as fast as spectral clustering on Euclidean spaces. To illustrate the efficacy of our approach, we present experimental results on the Wisconsin Breast Cancer Dataset, highlighting the superior performance of hyperbolic spectral clustering over its Euclidean counterpart. This work opens up avenues for utilizing non-Euclidean spaces in clustering algorithms, offering new perspectives for handling complex data structures and improving clustering efficiency.
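
To make the idea of swapping the Euclidean similarity matrix for a hyperbolic one concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes the points already lie strictly inside the Poincaré disc (the embedding step is not shown), that the similarity is a Gaussian kernel applied to the Poincaré geodesic distance with a bandwidth `sigma` (the abstract only says "appropriate", so this kernel is an assumption), and that a small hand-written k-means is enough for the final step. All function names are hypothetical.

```python
import numpy as np


def poincare_distance_matrix(X):
    """Pairwise geodesic distances for points strictly inside the unit ball."""
    sq_norms = np.sum(X * X, axis=1)
    if np.any(sq_norms >= 1.0):
        raise ValueError("all points must lie strictly inside the unit ball")
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff * diff, axis=2)
    denom = (1.0 - sq_norms)[:, None] * (1.0 - sq_norms)[None, :]
    # The arcosh argument is >= 1 in exact arithmetic; guard against rounding.
    arg = np.maximum(1.0 + 2.0 * sq_dist / denom, 1.0)
    return np.arccosh(arg)


def spectral_embedding(X, k, sigma=1.0):
    """Row-normalised eigenvectors of the normalised Laplacian I - D^-1/2 W D^-1/2."""
    dist = poincare_distance_matrix(X)
    # Assumed similarity: Gaussian kernel on the geodesic distance.
    W = np.exp(-dist**2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    deg = np.maximum(W.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k smallest.
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]
    norms = np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return U / norms


def kmeans(U, k, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialisation."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels


def hyperbolic_spectral_clustering(X, k, sigma=1.0):
    return kmeans(spectral_embedding(X, k, sigma), k)


# Toy data: two tight groups on opposite sides of the disc.
rng = np.random.default_rng(0)
group_a = np.array([0.7, 0.0]) + 0.02 * rng.standard_normal((20, 2))
group_b = np.array([-0.7, 0.0]) + 0.02 * rng.standard_normal((20, 2))
X = np.vstack([group_a, group_b])
labels = hyperbolic_spectral_clustering(X, k=2)
print(labels)
```

In this sketch the only change from a textbook normalized spectral clustering routine is the distance fed to the kernel, which mirrors the substitution the abstract describes. The bandwidth `sigma` and the choice of k-means initialisation are illustrative and would need tuning on real data.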
