Structure-enhanced Contrastive Learning for Graph Clustering

Xunlian Wu, Jingqi Hu, Anqi Zhang, Yining Quan, Qiguang Miao, Peng Gang Sun

TL;DR

This work addresses graph clustering without heavy reliance on data augmentation and without ignoring mesoscopic community structure. It introduces Structure-enhanced Contrastive Learning (SECL), combining cross-view contrastive learning over structure and attribute views with a structural alignment loss and a modularity-maximization objective to preserve community structure. Through two MLP encoders and a learnable mapping to clusters, SECL achieves state-of-the-art results across six diverse datasets, outperforming both traditional deep clustering methods and recent contrastive approaches. The approach reduces the need for pre-training and cumbersome augmentations while delivering robust, cluster-aware representations with practical implications for real-world network analysis.

Abstract

Graph clustering is a crucial task in network analysis with widespread applications, focusing on partitioning nodes into distinct groups with stronger intra-group connections than inter-group ones. Recently, contrastive learning has achieved significant progress in graph clustering. However, most methods suffer from two issues: 1) an over-reliance on meticulously designed data augmentation strategies, which can undermine the potential of contrastive learning; and 2) a neglect of cluster-oriented structural information, particularly the higher-order cluster (community) structure, which could reveal the mesoscopic cluster structure of the network. In this study, Structure-enhanced Contrastive Learning (SECL) is introduced to address these issues by leveraging inherent network structures. SECL utilizes a cross-view contrastive learning mechanism to enhance node embeddings without elaborate data augmentations, a structural contrastive learning module for ensuring structural consistency, and a modularity maximization strategy for harnessing clustering-oriented information. This comprehensive approach results in robust node representations that greatly enhance clustering performance. Extensive experiments on six datasets confirm SECL's superiority over current state-of-the-art methods, indicating a substantial improvement in the domain of graph clustering.
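To make the modularity objective mentioned in the abstract concrete, the following is a minimal sketch of the standard Newman modularity score for a hard partition, $Q = \frac{1}{2m}\operatorname{Tr}(C^\top B C)$ with $B = A - \frac{d d^\top}{2m}$. It assumes an undirected, unweighted graph given as a dense adjacency matrix; the function name `modularity` and the toy graph are illustrative only, and SECL's own objective (which works on learned soft assignments) may differ in detail.

```python
import numpy as np

def modularity(adj, labels):
    """Newman modularity of a hard partition (illustrative helper, not SECL's code)."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    deg = adj.sum(axis=1)              # node degrees d
    two_m = deg.sum()                  # 2m = twice the number of edges
    # Modularity matrix B = A - d d^T / (2m)
    B = adj - np.outer(deg, deg) / two_m
    # One-hot cluster assignment matrix C (n nodes x k clusters)
    clusters = np.unique(labels)
    C = (labels[:, None] == clusters[None, :]).astype(float)
    return float(np.trace(C.T @ B @ C) / two_m)

# Toy graph (assumption for illustration): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])   # one cluster per triangle
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])  # everything in one cluster
print(q_split, q_single)
```

Splitting the toy graph along its two triangles scores higher than the trivial single-cluster partition, which is the behaviour a modularity-maximization term rewards.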

Paper Structure (21 sections, 19 equations, 7 figures, 3 tables, 1 algorithm)

This paper contains 21 sections, 19 equations, 7 figures, 3 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of SECL applied to graph clustering. In the cross-view contrastive module, structure and attributes are first embedded into the latent space using the structure encoder $\text{MLP}^{(1)}$ and attribute encoder $\text{MLP}^{(2)}$, thereby bypassing the need for complex data augmentation. Subsequently, the similarity between the attribute and structure embeddings is computed to derive the cross-view contrastive loss. Next, within the structure contrastive loss module, consistency of structure information is ensured by aligning the similarity matrix with the neighboring structure information. Then, a modularity maximization module is employed to capture cluster-oriented information. Finally, the cost functions of the three modules are jointly optimized using the Adam optimizer. After optimization, the graph clustering results are obtained by applying K-means to the attribute embeddings $\textbf{H}^{(2)}$.
  • Figure 2: Performance of SECL under different values of the hyper-parameters $\lambda_1$ and $\lambda_2$ on six datasets.
  • Figure 3: Ablation comparisons of SECL on six datasets. (a), (b), (c), (d), (e) and (f) represent the results on CORA, CITESEER, AMAP, BAT, EAT and UAT, respectively.
  • Figure 4: Sensitivity analysis of the number of graph Laplacian filters $r$.
  • Figure 5: Sensitivity analysis of the number of $\mathrm{MLP}$ layers $t$.
  • ...and 2 more figures
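
The pipeline described in the Figure 1 caption can be sketched roughly as follows. This is a hedged illustration, not the paper's implementation: the exact loss equations are not given in this excerpt, so the InfoNCE-style cross-view loss (same node in both views as the positive pair), the mean-squared alignment against the adjacency matrix with self-loops, the temperature value, and all function names are assumptions. Random matrices stand in for the outputs of $\text{MLP}^{(1)}$ and $\text{MLP}^{(2)}$, and the modularity term, Adam optimization, and final K-means step are omitted.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def cross_view_similarity(h_struct, h_attr):
    """Cosine similarity between every structure embedding and every attribute embedding."""
    return l2_normalize(h_struct) @ l2_normalize(h_attr).T

def cross_view_loss(sim, temperature=0.5):
    """Assumed InfoNCE-style loss: node i in one view should match node i in the other."""
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def structure_alignment_loss(sim, adj):
    """Assumed alignment term: pull the similarity matrix toward A + I."""
    target = adj + np.eye(adj.shape[0])
    return float(np.mean((sim - target) ** 2))

# Stand-ins for the two encoder outputs H^(1) and H^(2) (random, for illustration only).
rng = np.random.default_rng(0)
n, d = 6, 4
h_struct = rng.normal(size=(n, d))
h_attr = rng.normal(size=(n, d))
adj = np.zeros((n, n))
for i, j in [(0, 1), (1, 2), (3, 4), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

sim = cross_view_similarity(h_struct, h_attr)
print(cross_view_loss(sim), structure_alignment_loss(sim, adj))
```

In a full training loop these two terms would be combined with a modularity term (weighted by hyper-parameters such as $\lambda_1$ and $\lambda_2$) and minimized with a gradient-based optimizer; the sketch only shows how the quantities named in the caption relate to each other.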