Disentangled Generative Graph Representation Learning

Xinyue Hu; Zhibin Duan; Xinyang Liu; Yuxin Li; Bo Chen; Mingyuan Zhou

TL;DR

DiGGR addresses the lack of disentanglement and robustness in generative graph SSL by introducing latent factor learning to factorize graphs and guide factor-wise masking within a Disentangled Graph Masked Autoencoder. The framework jointly optimizes a Weibull variational encoder for latent factors, factor-specific graph factorization, and a masked autoencoder with graph-level and factor-wise reconstructions. Empirical results across 11 datasets for node and graph classification demonstrate that DiGGR achieves competitive or superior performance relative to strong self-supervised baselines and reveals clearer, more interpretable factor structures. The work advances practical graph representation learning by providing disentangled, end-to-end trained representations that can enhance robustness and explainability in downstream tasks.
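The Weibull variational encoder mentioned above can be trained end to end because Weibull samples admit a simple reparameterization through the inverse CDF. The following is a minimal sketch of that sampling step only, not the authors' implementation; the function name `sample_weibull` and the shape and scale values are hypothetical placeholders for encoder outputs.

```python
import numpy as np

def sample_weibull(k, lam, rng):
    """Draw reparameterized Weibull(shape=k, scale=lam) samples.

    Uses the inverse-CDF transform z = lam * (-log(1 - u)) ** (1 / k)
    with u ~ Uniform[0, 1), so z is a deterministic function of (k, lam, u).
    """
    k = np.asarray(k, dtype=float)
    lam = np.asarray(lam, dtype=float)
    u = rng.uniform(size=np.broadcast(k, lam).shape)
    # log1p(-u) computes log(1 - u) accurately for small u.
    return lam * (-np.log1p(-u)) ** (1.0 / k)

rng = np.random.default_rng(0)
# Hypothetical encoder outputs for 5 nodes and 4 latent factors.
k = np.full((5, 4), 2.0)
lam = np.full((5, 4), 1.5)
z = sample_weibull(k, lam, rng)  # non-negative latent factors, shape (5, 4)
```

Because the noise `u` is drawn independently of `k` and `lam`, gradients can flow through `z` to the encoder parameters when the same transform is written in an automatic-differentiation framework.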

Abstract

Recently, generative graph models have shown promising results in learning graph representations through self-supervised methods. However, most existing generative graph representation learning (GRL) approaches rely on random masking across the entire graph, which overlooks the entanglement of learned representations. This oversight results in non-robustness and a lack of explainability. Furthermore, disentangling the learned representations remains a significant challenge and has not been sufficiently explored in GRL research. Based on these insights, this paper introduces DiGGR (Disentangled Generative Graph Representation Learning), a self-supervised learning framework. DiGGR aims to learn latent disentangled factors and utilizes them to guide graph mask modeling, thereby enhancing the disentanglement of learned representations and enabling end-to-end joint learning. Extensive experiments on 11 public datasets for two different graph learning tasks demonstrate that DiGGR consistently outperforms many previous self-supervised methods, verifying the effectiveness of the proposed approach.

Paper Structure (39 sections, 17 equations, 7 figures, 6 tables, 1 algorithm)

Introduction
Related works
Graph Self-Supervised Learning:
Disentangled Graph Learning:
Proposed Method
Preliminaries
Latent Factor Learning
Node Factorization:
Edge Factorization:
Variational Inference:
Disentangled Graph Masked Autoencoder
Latent Factor-wise Graph Masked Autoencoder
Graph-level Graph Mask Autoencoder:
Joint Training and Inference
Experiments
...and 24 more sections

Figures (7)

Figure 1: The number of latent factors is set to 4. In Fig. 1(a), the probabilities of nodes belonging to different latent groups are similar, resulting in nodes of the same type being incorrectly assigned to different factors. In contrast, Fig. 1(b) shows that the probabilities of node-factor affiliation are more discriminative, correctly categorizing nodes of the same type into the same latent group.
Figure 2: Overview of the computation graph of the proposed DiGGR. The input data passes successively through the three modules described in Sections 3.2 and 3.3: Latent Factor Learning, Graph Factorization, and Disentangled Graph Mask Autoencoder. The graph is first processed by Latent Factor Learning, which infers the latent factor $z$, and then by Graph Factorization, which uses $z$ to factorize the graph so that, in each factorized subgraph, nodes exchange more information with the neighbors they interact with most intensively. During the disentangled graph masking phase, each factorized subgraph is then masked individually to enhance the disentanglement of the resulting node representations.
Figure 3: t-SNE visualization of the MUTAG dataset, where $z$ is the latent factor and $H$ is the learned node representation used for downstream tasks.
Figure 4: Representation correlation matrix on Cora with number of factors $K = 4$. The GraphMAE panel depicts entangled representations, while the second panel illustrates disentangled ones.
Figure 5: Performance of the task under different choices of latent factor number $K$, where the horizontal axis represents the change in $K$ and the vertical axis is accuracy.
...and 2 more figures
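
The pipeline in the Figure 2 caption (factorize the graph with $z$, then mask each factorized subgraph separately) can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: the rule that assigns each edge to the factor maximizing the product of its endpoints' factor values, the function names, and the masking rate are all hypothetical.

```python
import numpy as np

def factorize_edges(edges, z):
    """Assign each edge (u, v) to the factor k maximizing z[u, k] * z[v, k].

    Assumption for illustration only; the source does not state this rule.
    """
    scores = z[edges[:, 0]] * z[edges[:, 1]]   # shape (E, K)
    return scores.argmax(axis=1)               # shape (E,)

def factor_wise_masks(edges, edge_factor, num_factors, mask_rate, rng):
    """For each factor, pick a random subset of its subgraph's nodes to mask."""
    masks = []
    for k in range(num_factors):
        # Nodes that appear in at least one edge assigned to factor k.
        nodes = np.unique(edges[edge_factor == k])
        n_mask = int(round(mask_rate * nodes.size))
        if n_mask == 0:
            masks.append(np.array([], dtype=int))
        else:
            masks.append(rng.choice(nodes, size=n_mask, replace=False))
    return masks

# Toy path graph with 5 nodes and hypothetical latent factors (K = 2).
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])
z = np.array([[2.0, 0.1], [1.5, 0.2], [1.0, 1.2], [0.1, 2.0], [0.2, 1.8]])

rng = np.random.default_rng(0)
edge_factor = factorize_edges(edges, z)
masks = factor_wise_masks(edges, edge_factor, num_factors=2, mask_rate=0.5, rng=rng)
```

Each entry of `masks` lists the nodes to hide when reconstructing the corresponding factor's subgraph, so a node shared by two subgraphs can be masked in one and visible in the other.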
