Self-Supervised Contrastive Graph Clustering Network via Structural Information Fusion

Xiaoyang Ji, Yuchen Zhou, Haofu Yang, Shiyue Xu, Jiahao Li

TL;DR

The paper addresses graph clustering, focusing on the unreliable priors that deep clustering methods learn during pre-training. It introduces CGCN, which combines autoencoder (AE) and graph autoencoder (GAE) pre-training with a contrastive learning module and a hierarchical structural-attribute fusion to strengthen those priors, producing a refined final embedding $Z_{final}$. Clustering is guided by a triplet self-supervised loss $L_{KL}$ that aligns the module outputs with a target distribution $p_{ij}$, which is derived from the soft assignments $q_{ij}$ computed with a Student's t-kernel over the fused AE and GAE embeddings. Experiments on DBLP, CITE, and ACM show consistent improvements over baselines such as DFCN, supporting the benefit of cross-module contrast and higher-order information fusion for graph clustering.
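
As a rough illustration of the self-supervised target described above, the sketch below computes Student's t-kernel soft assignments $q_{ij}$, a sharpened target distribution $p_{ij}$, and a single KL term. It assumes the standard formulation used in deep embedded clustering; the summary does not give the exact equations of CGCN, so the function names, the degree-of-freedom parameter `v`, and the toy data are all hypothetical. How the paper combines such terms across the fused, AE, and GAE outputs is not shown here.

```python
import math

def soft_assign(Z, centers, v=1.0):
    """Student's t-kernel soft assignment q[i][j] of embedding i to cluster j.

    Assumed form: q_ij is proportional to (1 + ||z_i - mu_j||^2 / v)^(-(v+1)/2).
    """
    Q = []
    for z in Z:
        row = []
        for mu in centers:
            d2 = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + d2 / v) ** (-(v + 1.0) / 2.0))
        total = sum(row)
        Q.append([x / total for x in row])  # normalise over clusters
    return Q

def target_distribution(Q):
    """Sharpened target p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    k = len(Q[0])
    f = [sum(row[j] for row in Q) for j in range(k)]  # soft cluster sizes
    P = []
    for row in Q:
        w = [row[j] ** 2 / f[j] for j in range(k)]
        total = sum(w)
        P.append([x / total for x in w])
    return P

def kl_divergence(P, Q):
    """KL(P || Q) summed over all nodes and clusters."""
    return sum(
        p * math.log(p / q)
        for p_row, q_row in zip(P, Q)
        for p, q in zip(p_row, q_row)
        if p > 0
    )

# Toy embeddings (hypothetical): two nodes near each of two cluster centres.
Z = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
centers = [[0.0, 0.0], [1.0, 1.0]]

Q = soft_assign(Z, centers)
P = target_distribution(Q)
loss = kl_divergence(P, Q)
```

Squaring $q_{ij}$ and renormalising pushes each node toward its most likely cluster, which is why $p_{ij}$ can serve as a self-generated training target for the assignments it was derived from.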

Abstract

Graph clustering, a classical task in graph learning, involves partitioning the nodes of a graph into distinct clusters. This task has applications in various real-world scenarios, such as anomaly detection, social network analysis, and community discovery. Current graph clustering methods commonly rely on module pre-training to obtain a reliable prior distribution for the model, which is then used as the optimization objective. However, these methods often overlook deeper supervised signals, which leaves the prior distribution less reliable than it could be. To address this issue, we propose a novel deep graph clustering method called CGCN. Our approach introduces contrastive signals and deep structural information into the pre-training process. Specifically, CGCN uses a contrastive learning mechanism to share information among multiple modules and allows the model to adaptively adjust the degree of information aggregation for structures of different orders. Experiments on multiple real-world graph datasets show that CGCN improves the reliability of the prior clustering distributions acquired through pre-training, which in turn yields notable gains in clustering performance.

Paper Structure (16 sections, 20 equations, 3 figures, 2 tables)


Figures (3)

  • Figure 1: Overall framework
  • Figure 2: Ablation study
  • Figure 3: Parameter sensitivity study