GraphLearner: Graph Node Clustering with Fully Learnable Augmentation
Xihong Yang, Erxue Min, Ke Liang, Yue Liu, Siwei Wang, Sihang Zhou, Huijun Wu, Xinwang Liu, En Zhu
TL;DR
GraphLearner tackles the reliance of graph clustering on handcrafted augmentations by introducing fully learnable structure and attribute augmentors, coupled with dual refinement via cross-view similarity and high-confidence pseudo-labels. The method jointly optimizes augmentation learning and contrastive clustering through a two-term loss L = L_a + αL_c, enabling task-specific augmentation and robust embeddings. Empirical results across six benchmarks show that GraphLearner consistently surpasses baselines, with ablations confirming the value of both augmentors and the refinement mechanisms. This approach advances graph clustering by tightly integrating augmentation learning with the clustering objective, improving both representation quality and downstream performance, and it suggests that learnable augmentations could be applied to other graph learning tasks.
Abstract
Contrastive deep graph clustering (CDGC) leverages contrastive learning to group nodes into clusters. The quality of the contrastive samples is crucial to performance, which makes augmentation a key factor in the process. However, the augmented samples in existing methods are always predefined based on human experience and are agnostic to the downstream clustering task, leading to high manual cost and poor performance. To overcome these limitations, we propose Graph Node Clustering with Fully Learnable Augmentation, termed GraphLearner. It introduces learnable augmentors that generate high-quality, task-specific augmented samples for CDGC. GraphLearner incorporates two learnable augmentors, one designed to capture attribute information and one to capture structural information. Moreover, we introduce two refinement matrices, the high-confidence pseudo-label matrix and the cross-view sample similarity matrix, to enhance the reliability of the learned affinity matrix. During training, we observe that the learnable augmentors and the contrastive learning network have distinct optimization goals: the embeddings must remain consistent across views, while the augmented samples must remain diverse. To address this challenge, we propose an adversarial learning mechanism within our method. In addition, we use a two-stage training strategy to refine the high-confidence matrices. Extensive experimental results on six benchmark datasets validate the effectiveness of GraphLearner. The code and appendix of GraphLearner are available on GitHub at https://github.com/xihongyang1999/GraphLearner.
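To make the two refinement matrices and the two-term loss from the TL;DR more concrete, the following is a minimal, dependency-free sketch. It is not taken from the GraphLearner code: the function names, the cosine-similarity form of the cross-view matrix, the pairwise "both confident and same cluster" form of the pseudo-label matrix, and the threshold value 0.9 are all assumptions chosen for illustration, and the construction in the paper may differ.

```python
import math


def l2_normalize(rows):
    """Scale each row vector to unit Euclidean length (zero rows are kept as-is)."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row] if norm > 0 else list(row))
    return out


def cross_view_similarity(z1, z2):
    """Cosine similarity between every node embedding in view 1 and view 2.

    Assumption: cosine similarity is used; the paper may define it differently.
    """
    a, b = l2_normalize(z1), l2_normalize(z2)
    return [[sum(x * y for x, y in zip(u, v)) for v in b] for u in a]


def high_confidence_matrix(probs, threshold=0.9):
    """Binary pairwise matrix built from soft cluster assignments.

    Entry (i, j) is 1 only when nodes i and j are both assigned with
    confidence >= threshold and share the same predicted cluster.
    The threshold is a hypothetical hyperparameter.
    """
    labels = [max(range(len(p)), key=p.__getitem__) for p in probs]
    confident = [max(p) >= threshold for p in probs]
    n = len(probs)
    return [
        [1 if confident[i] and confident[j] and labels[i] == labels[j] else 0
         for j in range(n)]
        for i in range(n)
    ]


def total_loss(l_a, l_c, alpha):
    """Two-term objective from the TL;DR: L = L_a + alpha * L_c."""
    return l_a + alpha * l_c
```

In this sketch, a node whose strongest cluster probability falls below the threshold contributes only zeros to its row and column, so uncertain nodes do not influence the refinement.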
