Heterogeneous Graph Contrastive Learning with Meta-path Contexts and Adaptively Weighted Negative Samples

Jianxiang Yu, Qingqing Ge, Xiang Li, Aoying Zhou

TL;DR

This paper tackles contrastive learning on heterogeneous information networks by introducing MEOW, which builds a coarse view from meta-path connectivity and a fine-grained view that encodes meta-path contexts. It identifies limitations of the InfoNCE loss in differentiating hard and false negatives and addresses them with clustering-based hard negative weighting and prototypical learning, plus an AdaMEOW variant that learns soft, adaptive weights. The approach yields superior node classification and clustering performance on four public HIN datasets, with ablations confirming the impact of meta-path contexts, weighted negatives, and prototypical loss. The work offers a principled framework for leveraging rich meta-path information and informed negative sampling to improve representation learning in heterogeneous graphs, with practical implications for downstream analytics on complex networks.

Abstract

Heterogeneous graph contrastive learning has received wide attention recently. Some existing methods use meta-paths, which are sequences of object types that capture semantic relationships between objects, to construct contrastive views. However, most of them ignore the rich meta-path context information that describes how two objects are connected by meta-paths. Further, they fail to distinguish negative samples, which could adversely affect the model performance. To address the problems, we propose MEOW, which considers both meta-path contexts and weighted negative samples. Specifically, MEOW constructs a coarse view and a fine-grained view for contrast. The former reflects which objects are connected by meta-paths, while the latter uses meta-path contexts and characterizes details on how the objects are connected. Then, we theoretically analyze the InfoNCE loss and recognize its limitations for computing gradients of negative samples. To better distinguish negative samples, we learn hard-valued weights for them based on node clustering and use prototypical contrastive learning to pull close embeddings of nodes in the same cluster. In addition, we propose a variant model AdaMEOW that adaptively learns soft-valued weights of negative samples to further improve node representation. Finally, we conduct extensive experiments to show the superiority of MEOW and AdaMEOW against other state-of-the-art methods.
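The idea of weighting negative samples can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the temperature `tau`, and in particular the rule "negatives in the anchor's cluster get weight 0, all others weight 1" are assumptions made for illustration, since this summary does not spell out MEOW's exact weighting scheme.

```python
import numpy as np

def weighted_info_nce(anchor, positive, negatives, weights, tau=1.0):
    """InfoNCE-style loss for one anchor with per-negative weights.

    anchor, positive: arrays of shape (d,)
    negatives: array of shape (n, d); weights: array of shape (n,)
    With all weights equal to 1 this is the usual InfoNCE form.
    """
    pos_logit = anchor @ positive / tau
    neg_logits = negatives @ anchor / tau
    # Subtract the largest logit before exponentiating for numerical stability.
    shift = max(pos_logit, neg_logits.max())
    pos_term = np.exp(pos_logit - shift)
    neg_term = np.sum(weights * np.exp(neg_logits - shift))
    return -np.log(pos_term / (pos_term + neg_term))

def hard_weights_from_clusters(anchor_cluster, negative_clusters,
                               same_cluster_weight=0.0):
    """Hypothetical hard-valued rule: negatives that share the anchor's
    cluster are treated as likely false negatives and down-weighted."""
    negative_clusters = np.asarray(negative_clusters)
    return np.where(negative_clusters == anchor_cluster,
                    same_cluster_weight, 1.0)

# Toy data: random embeddings and made-up cluster assignments.
rng = np.random.default_rng(0)
anchor = rng.normal(size=8)
positive = anchor + 0.1 * rng.normal(size=8)
negatives = rng.normal(size=(5, 8))

weights = hard_weights_from_clusters(0, [0, 1, 2, 0, 1])
loss_plain = weighted_info_nce(anchor, positive, negatives, np.ones(5))
loss_weighted = weighted_info_nce(anchor, positive, negatives, weights)
```

Because every weight here is at most 1, the weighted loss cannot exceed the unweighted one; a soft-valued variant in the spirit of AdaMEOW would replace the 0/1 weights with learned real values.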

Paper Structure

This paper contains 25 sections, 2 theorems, 20 equations, 8 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

Consider the InfoNCE contrastive loss (van den Oord et al., 2018) that uses the dot product to measure node similarity, denoted as $\mathcal{L}$. Let $f(x)$ represent the learned embedding of node $x$. Given $x_i$ as an anchor, $x_k$ as its positive sample and $x_{t_1}, x_{t_2}$ as its two negative
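
The theorem statement is truncated here, so the following is only a standard calculation that gives context for the quantities it involves, not a reconstruction of the theorem. It assumes the usual single-positive InfoNCE form with dot-product similarity and no temperature, and it writes $N$ (a symbol introduced for this sketch) for the set of negative samples of the anchor $x_i$:

$$
\mathcal{L} = -\log \frac{\exp\big(f(x_i)^\top f(x_k)\big)}{\exp\big(f(x_i)^\top f(x_k)\big) + \sum_{t \in N} \exp\big(f(x_i)^\top f(x_t)\big)}
$$

Differentiating with respect to the embedding of a negative sample $x_t$ gives

$$
\frac{\partial \mathcal{L}}{\partial f(x_t)} = \frac{\exp\big(f(x_i)^\top f(x_t)\big)}{\exp\big(f(x_i)^\top f(x_k)\big) + \sum_{t' \in N} \exp\big(f(x_i)^\top f(x_{t'})\big)} \, f(x_i)
$$

so for two negatives $x_{t_1}, x_{t_2}$ the gradient norms satisfy

$$
\frac{\big\lVert \partial \mathcal{L} / \partial f(x_{t_1}) \big\rVert}{\big\lVert \partial \mathcal{L} / \partial f(x_{t_2}) \big\rVert} = \exp\big(f(x_i)^\top f(x_{t_1}) - f(x_i)^\top f(x_{t_2})\big).
$$

Under these assumptions the gradient magnitude for a negative depends only on its similarity to the anchor, so a false negative that happens to be very similar to the anchor is pushed away at least as strongly as a genuinely hard negative with lower similarity.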

Figures (8)

  • Figure 1: The overall framework of the MEOW model. For details of each step, see the method section.
  • Figure 2: The relationship between node similarity $\text{sim}(\cdot)$ with a randomly selected anchor and gradient magnitude of loss functions w.r.t. negative samples after training 500 epochs on the (a) ACM dataset and 800 epochs on the (b) DBLP dataset. The orange dots indicate the InfoNCE loss and the blue dots indicate the loss function adopted by AdaMEOW.
  • Figure 3: The contrastive part of the AdaMEOW model.
  • Figure 4: The ablation study results with 40 labeled nodes per class.
  • Figure 5: Hyper-parameter analysis on the ACM dataset. Here, $K$ is the number of selected relevant neighbors w.r.t. a meta-path and $\lambda$ controls the relative importance of the two components of the MEOW loss function.
  • ...and 3 more figures

Theorems & Definitions (8)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 1
  • proof
  • Theorem 2
  • proof