Unsupervised Graph Clustering with Deep Structural Entropy
Jingyun Zhang, Hao Peng, Li Sun, Guanlin Wu, Chunyang Liu, Zhengtao Yu
TL;DR
DeSE tackles unsupervised graph clustering on sparse and noisy graphs by introducing Deep Structural Entropy as a differentiable objective. It combines a Structural Learning Layer, which builds an attributed graph from node features, with a Clustering Assignment Layer, which learns embeddings and soft cluster assignments on the enhanced graph. Both are optimized with a soft-assignment structural entropy loss and an edge-based cross-entropy loss. On four benchmark datasets the approach outperforms the compared baselines and yields interpretable clusters, remains robust to the choice of cluster number, and performs well even when the original graph is imperfect. By uniting structural information theory with end-to-end graph learning, DeSE offers a principled, trainable mechanism for integrating features and structure in clustering.
Abstract
Research on Graph Structure Learning (GSL) provides key insights for graph-based clustering, yet current methods such as Graph Neural Networks (GNNs), Graph Attention Networks (GATs), and contrastive learning often rely heavily on the original graph structure. Their performance deteriorates when the original graph's adjacency matrix is too sparse or contains noisy edges unrelated to clustering. Moreover, these methods first learn node embeddings and then apply traditional techniques such as k-means to form clusters, which may not fully capture the underlying graph structure between nodes. To address these limitations, this paper introduces DeSE, a novel unsupervised graph clustering framework incorporating Deep Structural Entropy. It enhances the original graph with quantified structural information and uses deep neural networks to form clusters. Specifically, we first propose a method for calculating structural entropy with soft assignment, which quantifies structure in a differentiable form. Next, we design a Structural Learning Layer (SLL) that generates an attributed graph from the original feature data; this graph serves as a target to enhance and optimize the original structural graph, thereby mitigating the issue of sparse connections between graph nodes. Finally, our clustering assignment method (ASS), based on GNNs, learns node embeddings and a soft assignment matrix to cluster on the enhanced graph. The ASS layer can be stacked to meet downstream task requirements, minimizing structural entropy for stable clustering and maximizing node consistency with an edge-based cross-entropy loss. Extensive comparative experiments on four benchmark datasets against eight representative unsupervised graph clustering baselines demonstrate the superiority of DeSE in both effectiveness and interpretability.
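To make the idea of structural entropy with soft assignment concrete, the sketch below evaluates the standard two-level structural entropy of a graph after relaxing the hard partition to a row-stochastic assignment matrix. It is an illustration under stated assumptions, not necessarily the exact formulation used in DeSE: the function name `soft_structural_entropy`, the relaxation of cluster volume to a weighted sum of degrees, and the relaxation of the cut to volume minus internal weight are all choices made here for illustration.

```python
import math


def soft_structural_entropy(adj, assign):
    """Two-level structural entropy under a soft partition (illustrative sketch).

    adj    : symmetric n x n list of lists with non-negative edge weights
    assign : n x k list of lists; row i holds node i's cluster probabilities

    Assumed relaxation (not taken from the paper):
      volume of cluster j   V_j = sum_i assign[i][j] * deg[i]
      internal weight       I_j = sum_{u,v} assign[u][j] * adj[u][v] * assign[v][j]
      cut of cluster j      g_j = V_j - I_j
    With a one-hot assignment this reduces to the usual hard-partition formula.
    """
    n, k = len(adj), len(assign[0])
    deg = [sum(row) for row in adj]
    vol = sum(deg)  # total volume of the graph

    entropy = 0.0
    for j in range(k):
        vol_j = sum(assign[i][j] * deg[i] for i in range(n))
        if vol_j <= 0:
            continue  # an empty cluster contributes nothing
        inner_j = sum(
            assign[u][j] * adj[u][v] * assign[v][j]
            for u in range(n)
            for v in range(n)
        )
        cut_j = vol_j - inner_j
        # Cluster-level term: cost of entering cluster j from the root.
        entropy -= (cut_j / vol) * math.log2(vol_j / vol)
        # Node-level term: cost of locating a node inside cluster j.
        for i in range(n):
            mass = assign[i][j] * deg[i]
            if mass > 0:
                entropy -= (mass / vol) * math.log2(deg[i] / vol_j)
    return entropy


# Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0

by_triangle = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3  # matches the two triangles
mixed = [[1.0, 0.0], [0.0, 1.0]] * 3               # splits both triangles

print(soft_structural_entropy(adj, by_triangle))
print(soft_structural_entropy(adj, mixed))
```

On this toy graph the assignment that follows the two triangles scores lower than the one that splits them, which is the behaviour a minimization objective needs. Because every operation is a sum, product, or logarithm of the assignment entries, the same expression can be written in an automatic differentiation framework and minimized by gradient descent.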
