Improving Node Representation by Boosting Target-Aware Contrastive Loss

Ying-Chun Lin; Jennifer Neville

TL;DR

This work introduces Target-Aware Contrastive Learning (Target-aware CL), a self-supervised learning process that aims to enhance target task performance by maximizing the mutual information between the target task and node representations.

Abstract

Graphs model complex relationships between entities, with nodes and edges capturing intricate connections. Node representation learning transforms nodes into low-dimensional embeddings, which are typically used as features for downstream tasks, so their quality has a significant impact on task performance. Existing approaches for node representation learning span (semi-)supervised, unsupervised, and self-supervised paradigms. In graph domains, (semi-)supervised learning often optimizes models based only on class labels, neglecting other abundant graph signals, which limits generalization. Self-supervised and unsupervised learning produce representations that better capture underlying graph signals, but the usefulness of these captured signals for downstream target tasks can vary. To bridge this gap, we introduce Target-Aware Contrastive Learning (Target-aware CL), which aims to enhance target task performance by maximizing the mutual information between the target task and node representations with a self-supervised learning process. This is achieved through a sampling function, XGBoost Sampler (XGSampler), that samples proper positive examples for the proposed Target-Aware Contrastive Loss (XTCL). By minimizing XTCL, Target-aware CL increases the mutual information between the target task and node representations, which improves model generalization. Additionally, XGSampler enhances the interpretability of each signal by showing the weights used for sampling the proper positive examples. We show experimentally that XTCL significantly improves performance on two target tasks, node classification and link prediction, compared to state-of-the-art models.
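
To make the role of positive-example selection concrete, the following is a minimal sketch of a generic InfoNCE-style contrastive loss for a single anchor node. It is not the paper's XTCL or XGSampler: the function names `cosine` and `info_nce`, the temperature value, and the hand-picked positives and negatives are all assumptions made for illustration. It only shows that the loss is lower when the chosen positives are actually similar to the anchor, which is why the choice of positives matters.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positives, negatives, temperature=0.5):
    """Generic InfoNCE-style loss for one anchor embedding (illustrative only).

    Each positive is scored against all positives and negatives; the loss is
    the mean negative log of the positive's share of the total score.
    """
    pos = [math.exp(cosine(anchor, p) / temperature) for p in positives]
    neg = [math.exp(cosine(anchor, n) / temperature) for n in negatives]
    denom = sum(pos) + sum(neg)
    return -sum(math.log(s / denom) for s in pos) / len(pos)


# Hypothetical 2-D embeddings: the anchor, two candidate positives, and negatives.
anchor = [1.0, 0.0]
negatives = [[-1.0, 0.0], [0.0, 1.0]]
loss_aligned = info_nce(anchor, [[0.9, 0.1]], negatives)      # positive close to the anchor
loss_misaligned = info_nce(anchor, [[-1.0, 0.0]], negatives)  # positive far from the anchor
print(loss_aligned < loss_misaligned)
```

In this toy setting, a sampler that proposes positives related to the target task would play the part of the hand-picked `[[0.9, 0.1]]` list; how XGSampler actually scores candidates is described in the paper itself, not here.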

Paper Structure (19 sections, 24 equations, 5 figures, 4 tables, 2 algorithms)

Introduction
Notations and Preliminaries
Node Contrastive Learning
Learning Node Representations with Task-Aware Contrastive Loss
Task-Aware Contrastive Loss
XGBoost Sampler Training
Computation Complexity Analysis
Performance Evaluation
RQ1. Performance on Downstream Tasks
RQ2. Comparing Various Supervision Signals
RQ3. Importance of Task-Aware Loss Function
Importance Weights of Semantic Relations
Empirical Training Time Analysis
Related Work
Conclusion
...and 4 more sections

Figures (5)

Figure 1: Model space consisting of models trained by different graph signals (text labels in each model space). The optimal model for a downstream task is represented by a yellow star. Different graph signals may result in varying degrees of overfitting or generalization error. Our goal is to increase the likelihood of finding the optimal model by optimizing it through adapting the information learned from different graph signals.
Figure 2: Learning curves of GCNs trained with various supervision signals. Our XGSampler is effective because XTCL continues to outperform others when the training size increases.
Figure 3: Results on semi-synthetic data when varying the dependency between node labels and a graph signal (1-Hop Attr. Dist.). The results show that minimizing losses that are not task-aware is not equivalent to maximizing the mutual information ${\mathcal{I}}({\bm{Y}};Z)$, and that this mismatch reduces model generalization on node classification.
Figure 4: Importance Weights of Semantic Relations. This figure demonstrates the importance of these graph signals (x-axis) to each downstream task and how our XGSampler adapts XTCL to improve model performance.
Figure 5: Training time on random graphs as graph size increases.

Theorems & Definitions (7)

Definition 1
Definition 2
Definition 3
Definition 4
Definition 5
Definition 6
Definition 7
