GraphLearner: Graph Node Clustering with Fully Learnable Augmentation

Xihong Yang; Erxue Min; Ke Liang; Yue Liu; Siwei Wang; Sihang Zhou; Huijun Wu; Xinwang Liu; En Zhu

TL;DR

GraphLearner addresses the reliance of contrastive graph clustering on handcrafted augmentations by introducing fully learnable structure and attribute augmentors, together with a dual refinement step based on a cross-view similarity matrix and a high-confidence pseudo-label matrix. The method jointly optimizes augmentation learning and contrastive clustering through a two-term loss $L = L_a + \alpha L_c$, so that the augmentations are learned for the clustering task rather than fixed in advance. Experiments on six benchmark datasets show that GraphLearner outperforms the compared baselines, and ablations confirm the contribution of both augmentors and of the refinement mechanism. By tying augmentation learning to the clustering objective, the approach improves representation quality and clustering performance, and it suggests that learnable augmentations could be applied to other graph learning tasks.

Abstract

Contrastive deep graph clustering (CDGC) leverages the power of contrastive learning to group nodes into different clusters. The quality of contrastive samples is crucial for achieving better performance, making augmentation techniques a key factor in the process. However, the augmented samples in existing methods are always predefined from human experience and are agnostic to the downstream clustering task, leading to high human resource costs and poor performance. To overcome these limitations, we propose Graph Node Clustering with Fully Learnable Augmentation, termed GraphLearner. It introduces learnable augmentors to generate high-quality, task-specific augmented samples for CDGC. GraphLearner incorporates two learnable augmentors specifically designed to capture attribute and structural information. Moreover, we introduce two refinement matrices, the high-confidence pseudo-label matrix and the cross-view sample similarity matrix, to enhance the reliability of the learned affinity matrix. During training, we observe that the learnable augmentors and the contrastive learning network have distinct optimization goals: we must guarantee both the consistency of the embeddings and the diversity of the augmented samples. To address this challenge, we propose an adversarial learning mechanism within our method. In addition, we leverage a two-stage training strategy to refine the high-confidence matrices. Extensive experimental results on six benchmark datasets validate the effectiveness of GraphLearner. The code and appendix of GraphLearner are available on GitHub at https://github.com/xihongyang1999/GraphLearner.
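As a rough illustration of the pieces named above, the following NumPy sketch builds a cross-view similarity matrix, a high-confidence pseudo-label matrix, and the two-term objective $L = L_a + \alpha L_c$ from the TL;DR. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the use of cosine similarity, the confidence threshold, and the reading of $L_a$ as the augmentation-learning term and $L_c$ as the contrastive clustering term are all assumptions made here for illustration, since this page does not define them.

```python
import numpy as np


def cross_view_similarity(z1, z2):
    """Cosine similarity between each node embedding in view 1 and view 2.

    Assumption: cosine similarity is used; the summary above does not
    specify the similarity measure.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    return z1 @ z2.T


def high_confidence_pseudo_label_matrix(labels, confidence, threshold):
    """Binary matrix: entry (i, j) is 1 when nodes i and j are both
    confident and share the same pseudo-label.

    Assumption: `confidence` and `threshold` are hypothetical inputs,
    for example derived from distances to cluster centers.
    """
    confident = confidence >= threshold
    same_label = labels[:, None] == labels[None, :]
    both_confident = confident[:, None] & confident[None, :]
    return (same_label & both_confident).astype(float)


def total_loss(loss_a, loss_c, alpha):
    """Two-term objective L = L_a + alpha * L_c."""
    return loss_a + alpha * loss_c


# Toy usage with hand-made values (not data from the paper).
z_view1 = np.array([[1.0, 0.0], [0.0, 2.0]])
z_view2 = np.array([[3.0, 0.0], [0.0, 1.0]])
similarity = cross_view_similarity(z_view1, z_view2)

pseudo_labels = np.array([0, 0, 1])
confidence = np.array([0.9, 0.95, 0.4])
pseudo_matrix = high_confidence_pseudo_label_matrix(pseudo_labels, confidence, 0.8)

loss = total_loss(loss_a=1.0, loss_c=2.0, alpha=0.5)
```

In this toy setup the third node falls below the threshold, so its row and column in the pseudo-label matrix are zero. How the two matrices are actually combined with the learned affinity matrix is described in the paper's Dual Refinement Module section and is not reproduced here.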

Paper Structure

This paper contains 21 sections, 15 equations, 3 figures, 5 tables, and 1 algorithm.

Introduction
Method
Notations Definition
Fully Learnable Augmentation Module
Structure Augmentor
Attribute Augmentor
Dual Refinement Module
Loss Function
Experiment
Experimental Setup
Performance Comparison (RQ1)
Time Cost and Memory Cost (RQ2)
Ablation Studies (RQ3)
Effectiveness of the Structure and Attribute Augmentor
Effectiveness of our learnable augmentation
...and 6 more sections

Figures (3)

Figure 1: Illustration of the fully learnable augmentation algorithm for attributed graph contrastive clustering. In the proposed algorithm, we design learnable augmentors to dynamically learn the structure and attribute information. In addition, we optimize the structure of the augmented view from two aspects, i.e., the high-confidence clustering pseudo-label matrix and the cross-view similarity matrix, which integrates the clustering task and the augmentation learning into a unified framework. Moreover, we propose an adversarial learning mechanism to keep cross-view consistency in the latent space while ensuring the diversity of the augmented views. Lastly, a two-stage training strategy is designed to obtain high-confidence refinement matrices, thus improving the reliability of the learned graph structure.
Figure 2: 2D $t$-SNE visualization of seven methods on two benchmark datasets. The first and second rows correspond to the CORA and AMAP datasets, respectively.
Figure 3: Sensitivity analysis of the hyper-parameter $\alpha$.
