Generative-Contrastive Heterogeneous Graph Neural Network

Yu Wang, Lei Sang, Yi Zhang, Yiwen Zhang, Xindong Wu

TL;DR

This work targets heterogeneous information networks (HINs) and two limitations of self-supervised HGNNs: restricted data augmentation and sampling bias in the contrastive discriminator. It introduces GC-HGNN, a generative-contrastive framework that couples a masked autoencoder-based meta-path view with hierarchical contrastive learning to capture both local one-hop and global higher-order information. Key contributions include a generative masked autoencoder for view augmentation, position-aware and semantics-aware positive sampling, and a hierarchical contrastive discriminator. In experiments on eight real-world datasets against seventeen baselines, the model outperforms the latest baselines on node classification and link prediction, supporting the framework’s effectiveness in extracting heterogeneous information while mitigating augmentation and sampling biases.

Abstract

Heterogeneous Graphs (HGs) effectively model complex relationships in the real world through multi-type nodes and edges. In recent years, inspired by self-supervised learning (SSL), contrastive learning (CL)-based Heterogeneous Graph Neural Networks (HGNNs) have shown great potential in utilizing data augmentation and contrastive discriminators for downstream tasks. However, data augmentation remains limited due to the graph data's integrity. Furthermore, the contrastive discriminators suffer from sampling bias and lack local heterogeneous information. To tackle the above limitations, we propose a novel Generative-Contrastive Heterogeneous Graph Neural Network (GC-HGNN). Specifically, we propose a heterogeneous graph generative learning method that enhances the CL-based paradigm. This paradigm includes: 1) a contrastive view augmentation strategy using a masked autoencoder; 2) a position-aware and semantics-aware positive sampling strategy for generating hard negative samples; 3) a hierarchical contrastive learning strategy aimed at capturing local and global information. Together, the hierarchical contrastive learning and sampling strategies constitute an enhanced contrastive discriminator under the generative-contrastive perspective. Finally, we compare our model with seventeen baselines on eight real-world datasets. Our model outperforms the latest baselines on node classification and link prediction tasks.
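
To make the role of the positive set in a contrastive discriminator concrete, the following is a minimal sketch of a standard InfoNCE-style loss for a single anchor node with several positives. It is not taken from the paper: the function names `cosine` and `contrastive_loss`, the temperature `tau`, and the toy embeddings are all illustrative assumptions, and the loss actually used by GC-HGNN may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor_idx, view_a, view_b, positives, tau=0.5):
    """InfoNCE-style loss for one anchor (hypothetical helper).

    view_a, view_b: lists of embedding vectors, one per node, from two views.
    positives: indices of nodes in view_b treated as positives for the anchor;
               all remaining nodes act as negatives.
    tau: temperature (an assumed default, not a value from the paper).
    """
    sims = [math.exp(cosine(view_a[anchor_idx], z) / tau) for z in view_b]
    pos_mass = sum(sims[j] for j in positives)
    return -math.log(pos_mass / sum(sims))


# Toy embeddings: node 0 agrees across views, node 1 is orthogonal to it,
# and node 2 points in the opposite direction.
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
view_b = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

loss_good = contrastive_loss(0, view_a, view_b, positives={0})
loss_bad = contrastive_loss(0, view_a, view_b, positives={1})
```

In this toy setup `loss_good` is smaller than `loss_bad`, which illustrates why the choice of positives matters: a biased positive set pushes the anchor toward nodes it should not be aligned with.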

Paper Structure (21 sections, 15 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: (a) Contrastive Learning paradigm aligns two similar views constructed through random edge dropping; (b) Generative-Contrastive paradigm aligns two reconstructed embeddings without altering the original graph structure.
  • Figure 2: A toy example of an HIN on the ACM dataset.
  • Figure 3: The training framework of the proposed GC-HGNN. (a) The model takes the entire heterogeneous graph as input to the network schema view and the meta-path-based views as input to the meta-path view; (b) the network schema view employs an attention mechanism to focus on the one-hop neighbors of each node; (c) the meta-path view uses a masked autoencoder to generate views for intra-contrast; (d) the hierarchical contrast uses the proposed dynamic sampling mechanism to generate more hard negative samples and strengthen the contrastive discriminator.
  • Figure 4: The hard sampling strategy that GC-HGNN proposes for inter-contrast. Location-aware Pos: for the green node, metapath2vec (Dong et al., 2017) is used to obtain node embeddings that reflect its location within the heterogeneous graph. Semantic-aware Pos: the blue node is connected to the anchor node through all types of meta-paths, indicating semantic similarity between them.
  • Figure 5: The comparison of GC-HGNN and its variants.
  • ...and 3 more figures
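
The semantic-aware positive selection described in the Figure 4 caption (a node is a positive when it is connected to the anchor through all types of meta-paths) can be sketched as a set intersection over per-meta-path neighbor sets. This is only an illustration under stated assumptions: the function name `semantic_aware_positives`, the dictionary layout, the meta-path names `PAP` and `PSP`, and the toy paper IDs are hypothetical, and excluding the anchor from its own positive set is a choice made here, not something the source specifies.

```python
def semantic_aware_positives(anchor, metapath_neighbors):
    """Return nodes reachable from `anchor` via every meta-path type.

    metapath_neighbors: dict mapping a meta-path name to an adjacency dict
    {node: set of meta-path-based neighbors}. Hypothetical layout.
    """
    neighbor_sets = [
        adjacency.get(anchor, set())
        for adjacency in metapath_neighbors.values()
    ]
    if not neighbor_sets:
        return set()
    # Keep only nodes that appear under all meta-path types.
    common = set.intersection(*neighbor_sets)
    # Assumption: the anchor is not counted as its own positive.
    common.discard(anchor)
    return common


# Toy data: paper-author-paper (PAP) and paper-subject-paper (PSP) neighbors.
toy_neighbors = {
    "PAP": {"p1": {"p2", "p3"}, "p2": {"p1"}, "p3": {"p1"}},
    "PSP": {"p1": {"p3", "p4"}, "p3": {"p1"}, "p4": {"p1"}},
}

positives_p1 = semantic_aware_positives("p1", toy_neighbors)
positives_p2 = semantic_aware_positives("p2", toy_neighbors)
```

Here `p3` is the only node linked to `p1` under both meta-paths, so it is the only semantic-aware positive for `p1`, while `p2` has no PSP neighbors and therefore gets an empty positive set. The location-aware positives in the caption would instead come from similarity between metapath2vec embeddings, which this sketch does not cover.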