Exploring Correlations of Self-Supervised Tasks for Graphs

Taoran Fang, Wei Zhou, Yifei Sun, Kaiqiao Han, Lvbin Ma, Yang Yang

TL;DR

The work addresses the limited understanding of how graph self-supervised tasks correlate with one another and of how universal the learned representations are. It introduces $\text{Cor}(t_1,t_2)$, ATD, and ARL to quantify cross-task expressiveness and difficulty, revealing that correlations are dataset-specific and that naïve multi-task training often fails to yield universally strong representations. To tackle this, the authors propose GraphTCM, a correlation-modeling module that predicts inter-task correlations from task representations via a learned exponential attention mechanism and congruent loss, enabling training of representations with high cross-task capability. Empirically, GraphTCM not only reconstructs observed correlations with low error but also yields representations that perform best on downstream node classification and link prediction across multiple datasets, significantly outperforming baseline multi-task and mixing strategies. The approach offers a principled way to model task relationships in graph SSL and demonstrates practical gains in robustness and generalization.

Abstract

Graph self-supervised learning has sparked a research surge in training informative representations without accessing any labeled data. However, our understanding of graph self-supervised learning remains limited, and the inherent relationships between various self-supervised tasks are still unexplored. Our paper aims to provide a fresh understanding of graph self-supervised learning based on task correlations. Specifically, we evaluate the performance of the representations trained by one specific task on other tasks and define correlation values to quantify task correlations. Through this process, we unveil the task correlations between various self-supervised tasks and can measure their expressive capabilities, which are closely related to downstream performance. By analyzing the correlation values between tasks across various datasets, we reveal the complexity of task correlations and the limitations of existing multi-task learning methods. To obtain more capable representations, we propose Graph Task Correlation Modeling (GraphTCM) to illustrate the task correlations and utilize it to enhance graph self-supervised training. The experimental results indicate that our method significantly outperforms existing methods across various downstream tasks.

Paper Structure (26 sections, 2 theorems, 26 equations, 4 figures, 8 tables)

This paper contains 26 sections, 2 theorems, 26 equations, 4 figures, 8 tables.

Key Result

Theorem 3.4

Given two tasks $t_1,t_2\in \mathcal{T}$ and their trained representations $\mathbf{H}_{t_1},\mathbf{H}_{t_2}$, the error of $\mathbf{H}_{t_2}$ on the downstream task is $e_{t_2}$, which can be expressed as: Then, the error of $\mathbf{H}_{t_1}$ on the downstream task satisfies: where $\triangle=\Vert \mathbf{Y}_{t_2}-\mathbf{Y}_{t_{ds}}\Vert+\Vert \mathbf{H}_{t_2}\cdot(\hat{\mathbf{W}_{t_2}^*}-\

Figures (4)

  • Figure 1: The real correlation values on various graph datasets. Here, the $y$-axis represents the task used to train the assessed representations, while the $x$-axis represents the self-supervised task on which the representations are evaluated. For instance, the intersection of the DGI row and the GAE column is denoted as $\text{Cor}(\text{DGI}, \text{GAE})$ according to the correlation value formula, signifying the comparative performance of the representations trained by DGI on the GAE task. A lower value indicates better performance.
  • Figure 2: The modeling capability of GraphTCM. The top plots display real correlation value matrices, while the bottom plots exhibit reconstructed correlation value matrices from GraphTCM.
  • Figure 3: The generalization capability of GraphTCM. The top plots display real correlation value matrices of three unseen tasks (PairAttSim, GRACE and GraphMAE), while the bottom plots exhibit predicted correlation value matrices from GraphTCM.
  • Figure 4: The statistics of ARL values for representations trained by GraphTCM and base tasks ("Optimum" represents the lowest ARL values among base tasks while "Average" represents the average ARL values for base tasks).
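
The row/column convention used in the correlation matrices of Figures 1 to 3 can be sketched in a few lines of Python. This excerpt does not reproduce the paper's formula for the correlation value, so the sketch assumes, purely for illustration, that each entry is the loss of the row task's representations on the column task, divided by the loss of the column task's own representations on that task. The function name `correlation_matrix` and all loss values are hypothetical.

```python
def correlation_matrix(losses):
    """Build a matrix of relative losses.

    losses[i][j] is the loss, on task j, of representations trained by task i.
    ASSUMPTION (not taken from the source): the correlation value is the ratio
    losses[i][j] / losses[j][j], so the diagonal is 1.0 and lower is better.
    """
    n = len(losses)
    return [[losses[i][j] / losses[j][j] for j in range(n)] for i in range(n)]


# Hypothetical losses for two tasks; rows = training task, columns = evaluation task.
tasks = ["DGI", "GAE"]
losses = [
    [0.5, 1.5],   # representations trained by DGI, evaluated on DGI and GAE
    [2.0, 0.75],  # representations trained by GAE, evaluated on DGI and GAE
]

cor = correlation_matrix(losses)
row, col = tasks.index("DGI"), tasks.index("GAE")
cor_dgi_gae = cor[row][col]  # entry at the DGI row and the GAE column
```

Under this assumed ratio the matrix is not symmetric in general: in the toy data, the entry for the DGI row and GAE column differs from the entry for the GAE row and DGI column, which is why the captions distinguish the training task (rows) from the evaluation task (columns).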

Theorems & Definitions (7)

  • Definition 3.1
  • Definition 3.2
  • Definition 3.3
  • Theorem 3.4
  • Theorem 3.5
  • proof
  • proof