Improving Node Representation by Boosting Target-Aware Contrastive Loss

Ying-Chun Lin, Jennifer Neville

TL;DR

This work introduces Target-Aware Contrastive Learning (Target-aware CL), which aims to enhance target task performance by maximizing the mutual information between the target task and node representations through a self-supervised learning process.

Abstract

Graphs model complex relationships between entities, with nodes and edges capturing intricate connections. Node representation learning transforms nodes into low-dimensional embeddings. These embeddings are typically used as features for downstream tasks, so their quality has a significant impact on task performance. Existing approaches for node representation learning span (semi-)supervised, unsupervised, and self-supervised paradigms. In graph domains, (semi-)supervised learning often optimizes models based only on class labels, neglecting other abundant graph signals, which limits generalization. While self-supervised or unsupervised learning produces representations that better capture underlying graph signals, the usefulness of these captured signals for downstream target tasks can vary. To bridge this gap, we introduce Target-Aware Contrastive Learning (Target-aware CL), which aims to enhance target task performance by maximizing the mutual information between the target task and node representations through a self-supervised learning process. This is achieved through a sampling function, XGBoost Sampler (XGSampler), which samples proper positive examples for the proposed Target-Aware Contrastive Loss (XTCL). By minimizing XTCL, Target-aware CL increases the mutual information between the target task and node representations, such that model generalization is improved. Additionally, XGSampler enhances the interpretability of each signal by showing the weights for sampling the proper positive examples. We show experimentally that XTCL significantly improves performance on two target tasks, node classification and link prediction, compared to state-of-the-art models.
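
To make the idea of a contrastive loss driven by sampled positives more concrete, here is a minimal sketch in NumPy. It is not the paper's XTCL or XGSampler: the function names `sample_positives` and `info_nce_loss`, the score matrix `pos_scores`, and the temperature value are all assumptions introduced for illustration, and a generic InfoNCE-style loss stands in for the actual objective. The sketch only shows the general pattern in which a sampler assigns each anchor node a positive example and the loss pulls the anchor's embedding toward that positive and away from the other nodes.

```python
import numpy as np


def sample_positives(pos_scores, rng):
    """Draw one positive index per anchor, proportional to its scores.

    pos_scores[i, j] is a hypothetical stand-in for a sampler's estimate that
    node j is a useful positive for anchor i (not the paper's XGSampler).
    Every row is assumed to have at least one positive off-diagonal score.
    """
    scores = np.array(pos_scores, dtype=float)
    np.fill_diagonal(scores, 0.0)  # a node is never its own positive
    probs = scores / scores.sum(axis=1, keepdims=True)
    return np.array([rng.choice(len(p), p=p) for p in probs])


def info_nce_loss(z, pos_idx, temperature=0.5):
    """InfoNCE-style loss: each anchor should score its positive highest."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # use cosine similarity
    logits = z @ z.T / temperature
    np.fill_diagonal(logits, -np.inf)  # exclude self-pairs from the softmax
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(z)), pos_idx].mean()


# Toy usage: random embeddings and random (assumed) sampler scores.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 4))
pos_scores = rng.random((6, 6))
pos_idx = sample_positives(pos_scores, rng)
loss = info_nce_loss(z, pos_idx)

# Sanity check: positives that match the embedding structure give a lower loss.
z_pairs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
loss_good = info_nce_loss(z_pairs, np.array([1, 0, 3, 2]))
loss_bad = info_nce_loss(z_pairs, np.array([2, 3, 0, 1]))
```

In this toy setting, `loss_good` is smaller than `loss_bad`, which illustrates why the choice of positives matters: the same embeddings look well or poorly trained depending on which pairs the sampler treats as positive. A target-aware sampler would presumably bias `pos_scores` toward pairs that are informative for the downstream task.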

Paper Structure

This paper contains 19 sections, 24 equations, 5 figures, 4 tables, 2 algorithms.

Figures (5)

  • Figure 1: Model space consisting of models trained by different graph signals (text labels in each model space). The optimal model for a downstream task is represented by a yellow star. Different graph signals may result in varying degrees of overfitting or generalization error. Our goal is to increase the likelihood of finding the optimal model by optimizing it through adapting the information learned from different graph signals.
  • Figure 2: Learning curves of GCNs trained with various supervision signals. XTCL continues to outperform the others as the training size increases, which indicates that our XGSampler is effective.
  • Figure 3: Results on semi-synthetic data when varying the dependency between node labels and a graph signal (1-Hop Attr. Dist.). The results show that minimizing losses that are not task-aware is not equivalent to maximizing the mutual information ${\mathcal{I}}({\bm{Y}};Z)$, and that this mismatch decreases model generalization on node classification.
  • Figure 4: Importance Weights of Semantic Relations. This figure demonstrates the importance of these graph signals (x-axis) to each downstream task and how our XGSampler adapts XTCL to improve model performance.
  • Figure 5: Training time on random graphs as graph size increases.

Theorems & Definitions (7)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7