Adaptive Graph Auto-Encoder for General Data Clustering
Xuelong Li, Hongyuan Zhang, Rui Zhang
TL;DR
AdaGAE tackles general data clustering by constructing a weighted graph from the data under a generative view of graphs and learning embeddings with a graph auto-encoder, while adaptively updating the graph to reveal high-level structure. A key insight is that naively updating a graph of fixed sparsity causes the clustering to degenerate; AdaGAE mitigates this by increasing the sparsity parameter $k$ over iterations and by using a distance-based decoder with KL regularization. Theoretical results explain this degeneration and provide an alternative interpretation of the decoder, and spectral analysis shows that the adaptive updates smooth the Laplacian, contributing to stable, scalable clustering. Empirically, AdaGAE outperforms baselines across a diverse set of text and image datasets and is robust to initialization and data scale.
Abstract
Graph-based clustering plays an important role in clustering. Recent studies on graph convolutional neural networks have achieved impressive success on graph-structured data. In general clustering tasks, however, the data come with no graph structure, so the strategy used to construct a graph is crucial for performance. How to extend graph convolutional networks to general clustering tasks is therefore an attractive problem. In this paper, we propose a graph auto-encoder for general data clustering that constructs the graph adaptively from a generative perspective on graphs. The adaptive process is designed to induce the model to exploit the high-level information behind the data and to make full use of the non-Euclidean structure. We further design a novel mechanism, supported by rigorous analysis, to avoid the collapse caused by the adaptive construction. By combining a generative model for network embedding with graph-based clustering, we develop a graph auto-encoder with a novel decoder that performs well in scenarios involving weighted graphs. Extensive experiments demonstrate the superiority of our model.
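To make the ingredients named above concrete, the following is a minimal NumPy sketch of a $k$-sparse weighted graph built from embeddings, a distance-based decoder, a KL term comparing the two, and a schedule that grows $k$. It is an illustration under stated assumptions, not the formulation used in the paper: the softmax weighting over squared Euclidean distances, the linear schedule, and all function names (`sparse_connectivity`, `distance_decoder`, `kl_divergence`, `k_schedule`) are hypothetical choices made here for clarity.

```python
import numpy as np


def pairwise_sq_dists(Z):
    """Squared Euclidean distances between all rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off


def sparse_connectivity(Z, k):
    """Row-stochastic weighted graph with k nonzeros per row.

    Assumption: weights are a softmax over negative squared distances to
    the k nearest neighbors. This is an illustrative choice only.
    """
    n = Z.shape[0]
    D = pairwise_sq_dists(Z)
    np.fill_diagonal(D, np.inf)              # exclude self-loops
    idx = np.argsort(D, axis=1)[:, :k]       # k nearest neighbors per row
    rows = np.arange(n)[:, None]
    d = D[rows, idx]
    w = np.exp(-(d - d.min(axis=1, keepdims=True)))  # shifted for stability
    P = np.zeros((n, n))
    P[rows, idx] = w / w.sum(axis=1, keepdims=True)
    return P


def distance_decoder(Z):
    """Dense reconstruction: closer embeddings get higher connection probability."""
    D = pairwise_sq_dists(Z)
    np.fill_diagonal(D, np.inf)              # zero probability for self-loops
    logits = -D
    logits = logits - logits.max(axis=1, keepdims=True)
    Q = np.exp(logits)
    return Q / Q.sum(axis=1, keepdims=True)


def kl_divergence(P, Q, eps=1e-12):
    """Sum of row-wise KL(P || Q), evaluated on the support of P."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / (Q[mask] + eps))))


def k_schedule(k0, step, k_max, n_iters):
    """Hypothetical linear schedule: k grows by `step` per update, capped at k_max."""
    return [min(k0 + t * step, k_max) for t in range(n_iters)]
```

In a training loop, one would rebuild the graph from the current embeddings with the next value from `k_schedule`, then update the encoder to reduce `kl_divergence(P, distance_decoder(Z))`. Growing $k$ keeps the rebuilt graph from merely restating the neighborhoods already encoded in the embeddings, which is the intuition behind the degeneration described above.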
