Generative-Contrastive Heterogeneous Graph Neural Network
Yu Wang, Lei Sang, Yi Zhang, Yiwen Zhang, Xindong Wu
TL;DR
This work targets heterogeneous information networks (HINs) and two limitations of self-supervised HGNNs: constrained data augmentation and sampling bias. It introduces GC-HGNN, a generative-contrastive framework that couples a masked autoencoder-based meta-path view with hierarchical contrastive learning to capture both local one-hop and global higher-order information. Key contributions include a generative masked autoencoder for view augmentation, position-aware and semantics-aware positive sampling, and a hierarchical contrastive discriminator that captures local and global information. In experiments on eight real-world datasets against seventeen baselines, GC-HGNN outperforms the latest baselines on node classification and link prediction, supporting the framework's effectiveness in extracting heterogeneous information while mitigating augmentation and sampling biases.
Abstract
Heterogeneous Graphs (HGs) effectively model complex real-world relationships through multiple types of nodes and edges. In recent years, inspired by self-supervised learning (SSL), contrastive learning (CL)-based Heterogeneous Graph Neural Networks (HGNNs) have shown great potential in using data augmentation and contrastive discriminators for downstream tasks. However, data augmentation is still limited by the need to preserve the integrity of the graph data. Furthermore, the contrastive discriminators suffer from sampling bias and lack local heterogeneous information. To tackle these limitations, we propose a novel Generative-Contrastive Heterogeneous Graph Neural Network (GC-HGNN). Specifically, we propose a heterogeneous graph generative learning method that enhances the CL-based paradigm. This paradigm includes: 1) A contrastive view augmentation strategy using a masked autoencoder. 2) A position-aware and semantics-aware positive sampling strategy for generating hard negative samples. 3) A hierarchical contrastive learning strategy aimed at capturing local and global information. Together, the hierarchical contrastive learning and sampling strategies constitute an enhanced contrastive discriminator under the generative-contrastive perspective. Finally, we compare our model with seventeen baselines on eight real-world datasets. Our model outperforms the latest baselines on node classification and link prediction tasks.
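To make two of the ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that masking is applied to edges of a dense adjacency matrix and that the contrastive objective is an InfoNCE-style loss with an explicit positive-sample mask; the abstract specifies neither. The function names `mask_adjacency` and `info_nce`, the parameters `mask_rate` and `tau`, and the use of NumPy are all hypothetical choices made for illustration.

```python
import numpy as np


def mask_adjacency(adj, mask_rate, rng):
    """Return a copy of adj with a random fraction of its nonzero entries set to 0.

    Assumption: each nonzero entry is treated as one directed edge, so a
    symmetric matrix is not kept symmetric by this sketch.
    """
    rows, cols = np.nonzero(adj)
    n_mask = int(round(mask_rate * rows.size))
    drop = rng.choice(rows.size, size=n_mask, replace=False)
    masked = adj.copy()
    masked[rows[drop], cols[drop]] = 0
    return masked


def info_nce(z1, z2, pos_mask, tau=0.5):
    """InfoNCE-style loss between two views of the same nodes.

    z1, z2:   (n, d) embeddings from the two views (rows must be non-zero).
    pos_mask: (n, n) 0/1 matrix; pos_mask[i, j] == 1 marks node j in view 2
              as a positive for node i in view 1. Every row needs at least
              one positive.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = np.exp(z1 @ z2.T / tau)          # exponentiated cosine similarity
    pos = (sim * pos_mask).sum(axis=1)     # mass assigned to positives
    return float(-np.log(pos / sim.sum(axis=1)).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adj = (rng.random((6, 6)) < 0.4).astype(float)
    masked = mask_adjacency(adj, mask_rate=0.5, rng=rng)
    z = rng.normal(size=(6, 8))
    # Simplest positive set: each node is its own positive across views.
    print(info_nce(z, z + 0.1 * rng.normal(size=z.shape), np.eye(6)))
```

In this sketch the positive mask is where a position-aware or semantics-aware sampling strategy would plug in: replacing the identity matrix with a mask that also marks structurally or semantically similar nodes changes which pairs are pulled together and which remain as negatives.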
