Rethinking Dimensional Rationale in Graph Contrastive Learning from Causal Perspective

Qirui Ji, Jiangmeng Li, Jie Hu, Rui Wang, Changwen Zheng, Fanjiang Xu

TL;DR

The paper tackles the problem that graph contrastive learning can overfit to task-agnostic information. It introduces dimensional rationale (DR) within a structural causal model, enabling a learnable DR weight and a redundancy-reducing objective, optimized via bi-level meta-learning and backdoor adjustment. The proposed DRGCL framework demonstrates improved discriminability and transfer across multiple graph benchmarks, supported by theoretical guarantees that DR reduces downstream risk and conditional variance. This causal-intervention approach yields robust graph representations with practical implications for improved transferability in real-world graph tasks.

Abstract

Graph contrastive learning is a general learning paradigm that excels at capturing invariant information from diverse perturbations in graphs. Recent works focus on exploring the structural rationale of graphs, thereby increasing the discriminability of the invariant information. However, such methods may lead graph models to mis-learn the interpretability of graphs, so that the learned noisy and task-agnostic information interferes with graph prediction. To this end, with the aim of exploring the intrinsic rationale of graphs, we propose to capture the dimensional rationale from graphs, which has not received sufficient attention in the literature. Exploratory experiments attest to the feasibility of this approach. To elucidate the mechanism behind the performance improvement arising from the dimensional rationale, we rethink the dimensional rationale in graph contrastive learning from a causal perspective and formalize the causality among the variables in the pre-training stage to build the corresponding structural causal model. Based on this structural causal model, we propose the dimensional rationale-aware graph contrastive learning approach, which introduces a learnable dimensional rationale acquiring network and a redundancy reduction constraint. The learnable dimensional rationale acquiring network is updated with a bi-level meta-learning technique, and the redundancy reduction constraint disentangles the redundant features through a decorrelation process during learning. Empirically, compared with state-of-the-art methods, our method yields significant performance boosts on various benchmarks with respect to discriminability and transferability. The code implementation of our method is available at https://github.com/ByronJi/DRGCL.
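
The abstract describes the redundancy reduction constraint only as a decorrelation process. As a minimal sketch of what such a constraint can look like, the NumPy function below penalizes the off-diagonal entries of the correlation matrix between representation dimensions. The name `decorrelation_penalty`, the standardization step, and the squared off-diagonal form are assumptions made for illustration, in the spirit of Barlow Twins-style objectives; this is not claimed to be the loss used in DRGCL, whose exact form should be taken from the paper or the linked repository.

```python
import numpy as np


def decorrelation_penalty(z: np.ndarray, eps: float = 1e-8) -> float:
    """Sum of squared off-diagonal correlations between embedding dimensions.

    z has shape (num_graphs, num_dims). Hypothetical helper, not the DRGCL loss.
    """
    # Standardize each dimension so the Gram matrix below is a correlation matrix.
    z = (z - z.mean(axis=0)) / (z.std(axis=0) + eps)
    corr = (z.T @ z) / z.shape[0]
    # Keep only off-diagonal entries: they measure redundancy between dimensions.
    off_diag = corr - np.diag(np.diag(corr))
    return float(np.sum(off_diag ** 2))


rng = np.random.default_rng(0)
independent = rng.normal(size=(512, 8))
# Duplicate the first four dimensions to create fully redundant features.
redundant = np.concatenate([independent[:, :4], independent[:, :4]], axis=1)

penalty_independent = decorrelation_penalty(independent)
penalty_redundant = decorrelation_penalty(redundant)
print(penalty_independent, penalty_redundant)
```

In this toy setting the duplicated dimensions produce a much larger penalty than the independent ones, which is the behavior a decorrelation term is meant to discourage during training.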

Paper Structure

This paper contains 35 sections, 2 theorems, 17 equations, 10 figures, 9 tables, 1 algorithm.

Key Result

Theorem 5.1

(Connecting Graph DR-aware Representations to Downstream Cross-Entropy Loss). Under the minimal assumption of GCL, i.e., that the graph contrastive label is invariant to the distributions, when $\mathcal{R}$ is optimal, for any $\tilde{\boldsymbol{z}} \in \mathbb{R}$, the cross-entropy loss $\mathcal{L}_{\dots}$ … where $M$ is the number of negative samples, $D$ denotes the dimensionality of the representation, …

Figures (10)

  • Figure 1: Scatter plots obtained with GraphCL when dimensions are randomly preserved, on the PROTEINS and RDT-B datasets. The red dashed lines denote the performance achieved by the original GraphCL representation. The colored points denote the downstream classification performance of embeddings with only certain dimensions preserved; the dimensions that are not preserved are set to 0. The experiment is motivated by the intuition that predictions on downstream tasks may be significantly affected if the multi-dimensional representations are perturbed.
  • Figure 2: SCM for GCL pretraining.
  • Figure 3: Illustration of DRGCL. The solid blue line pointing backwards represents the regular training step. The solid red line pointing backwards represents the meta-learning step.
  • Figure 4: Visualization of the representations learned on the BBBP dataset by GraphCL and by our method using the redundancy reduction method. The learned features are projected into a colored image in RGB format, where different colors represent different types of features. The horizontal axis represents the feature dimensions, and the vertical axis represents samples of different classes. The greater the color contrast, the lower the similarity between dimensional features. The plots show the similarity between dimensional features for the first 64 samples of BBBP.
  • Figure 5: A counter-intuitive high-dimensional phenomenon in the problem of measuring concentration on a sphere. Almost the whole area of a high-dimensional sphere is concentrated in an $\epsilon$-strip around its equator and, in fact, around any great circle.
  • ...and 5 more figures
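
The experiment behind Figure 1 keeps a random subset of embedding dimensions and sets the rest to 0 before downstream evaluation. A minimal sketch of that masking step follows, assuming the embeddings are stored as a NumPy array of shape (num_graphs, num_dims). The helper name `preserve_random_dims` and the parameter `num_keep` are hypothetical, random data stands in for pretrained GraphCL embeddings, and the downstream classifier behind the scatter plots is omitted.

```python
import numpy as np


def preserve_random_dims(emb, num_keep, rng):
    """Zero all but num_keep randomly chosen dimensions of emb.

    Returns the masked copy and the sorted indices of the preserved dimensions.
    Hypothetical helper for illustration only.
    """
    num_dims = emb.shape[1]
    keep = rng.choice(num_dims, size=num_keep, replace=False)
    mask = np.zeros(num_dims, dtype=emb.dtype)
    mask[keep] = 1
    # Broadcasting applies the same dimension mask to every graph embedding.
    return emb * mask, np.sort(keep)


rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 16))  # stand-in for pretrained graph embeddings
masked, kept = preserve_random_dims(embeddings, num_keep=4, rng=rng)
print(kept)
```

Repeating the call with many random subsets and evaluating each masked embedding on the downstream task would give one point per subset, which is how a scatter plot of this kind could be produced.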

Theorems & Definitions (3)

  • Theorem 5.1
  • Theorem 5.2
  • proof