Dual-Kernel Graph Community Contrastive Learning

Xiang Chen, Kun Yue, Wenjie Liu, Zhenyu Zhang, Liang Duan

TL;DR

This work tackles the scalability and latency challenges of graph contrastive learning on large graphs by introducing Dual-Kernel Graph Community Contrastive Learning (DKGCCL). It preserves node-level detail while exploiting community structure through bi-level features and dual kernels based on multiple kernel learning (MKL), achieving linear-time training and reduced inference cost via a decoupled GNN and distillation to a lightweight MLP. Theoretical analyses show that the dual-kernel GCCL loss can approximate multi-hop diffusion-based contrastive objectives and that the distillation objective increases mutual information with $K$-hop patterns, supporting robust downstream performance. Empirically, DKGCCL delivers state-of-the-art or competitive results across 16 real-world datasets, with strong scalability, linear inference, and substantial speedups on large graphs, validating its practical impact for large-scale graph representation learning.

Abstract

Graph Contrastive Learning (GCL) has emerged as a powerful paradigm for training Graph Neural Networks (GNNs) in the absence of task-specific labels. However, its scalability on large-scale graphs is hindered by the intensive message-passing mechanism of GNNs and the quadratic computational complexity of the contrastive loss over positive and negative node pairs. To address these issues, we propose an efficient GCL framework that transforms the input graph into a compact network of interconnected node sets while preserving structural information across communities. We first introduce a kernelized graph community contrastive loss with linear complexity, enabling effective information transfer among node sets to capture hierarchical structural information of the graph. We then incorporate a knowledge distillation technique into the decoupled GNN architecture to accelerate inference while maintaining strong generalization performance. Extensive experiments on sixteen real-world datasets of varying scales demonstrate that our method outperforms state-of-the-art GCL baselines in both effectiveness and scalability.
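
As a rough illustration of why a kernelized contrastive loss can avoid the quadratic cost over node pairs, the sketch below uses a generic kernel of the form $k(u, v) = \phi(u)^\top \phi(v)$ with an explicit feature map. Summing $k(z_i, z_j)$ over all $j$ then factorizes as $\phi(z_i)^\top \sum_j \phi(z_j)$, which needs only one pass over the nodes. The feature map `feature_map` (ELU + 1) and both function names are assumptions chosen for this example; this is not the paper's dual-kernel loss, only the general trick that kernelized losses of this kind can build on.

```python
import numpy as np

def feature_map(z):
    # Hypothetical positive feature map phi(z) = elu(z) + 1 (an assumption, not from the paper).
    return np.where(z > 0, z + 1.0, np.exp(z))

def kernel_row_sums_quadratic(z):
    # Baseline: build the full n x n Gram matrix, then sum each row. O(n^2 d) time.
    phi = feature_map(z)
    gram = phi @ phi.T
    return gram.sum(axis=1)

def kernel_row_sums_linear(z):
    # Same quantity without the Gram matrix: phi(z_i)^T (sum_j phi(z_j)). O(n d) time.
    phi = feature_map(z)
    total = phi.sum(axis=0)   # shape (d,), shared by every node
    return phi @ total        # shape (n,)

rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 16))   # toy embeddings: 1000 nodes, 16 dimensions
slow = kernel_row_sums_quadratic(z)
fast = kernel_row_sums_linear(z)
```

Both functions return the same per-node sums (up to floating-point error), which play the role of the denominator in a contrastive objective; only the second scales linearly with the number of nodes.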

Paper Structure

This paper contains 60 sections, 11 theorems, 53 equations, 8 figures, 6 tables, 1 algorithm.

Key Result

Proposition 1

Let the feature dimension of the community-level feature space $\mathcal{X}_P$ be $d^P$. Then, the expected number of distinct partitioned substructures generated by the Dropout($\cdot$) operation for each partition $P_j$ is: where $P^s_{j}$ is a substructure of $P_j$ on the feature dimension $s$, and $p$ is the dropout probability.
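
To make the Dropout($\cdot$) operation on community-level features more concrete, here is a minimal sketch. It assumes that community features are obtained by mean-pooling the features of the nodes in each partition, and that dropout zeroes each feature dimension independently with probability $p$; both choices, and the names `community_features` and `feature_dropout`, are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def community_features(x, assign, num_parts):
    # Mean-pool node features x (n, d) into (num_parts, d) by partition id.
    # Mean pooling is an assumption made for this sketch.
    sums = np.zeros((num_parts, x.shape[1]))
    np.add.at(sums, assign, x)
    counts = np.bincount(assign, minlength=num_parts).reshape(-1, 1)
    return sums / np.maximum(counts, 1)

def feature_dropout(h, p, rng):
    # Zero each entry independently with probability p (no rescaling, for clarity).
    mask = rng.random(h.shape) >= p
    return h * mask, mask

x = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # three nodes, two features
assign = np.array([0, 0, 1])                         # nodes 0 and 1 form partition 0
h = community_features(x, assign, num_parts=2)

rng = np.random.default_rng(0)
view, mask = feature_dropout(h, p=0.5, rng=rng)      # one augmented view of the partitions
```

Because the mask is drawn per partition and per feature dimension, each dimension of a partition can be kept or dropped separately, in contrast to an augmentation that discards every dimension of a node at once.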

Figures (8)

  • Figure 1: The overall framework of our method.
  • Figure 2: Inference efficiency comparison.
  • Figure 3: Comparison of different dual-kernel graph community contrastive loss variants.
  • Figure 4: Comparison with StructComp.
  • Figure 5: Augmentation strategy comparison with StructComp. Our method can be regarded as constructing a random substructure along each feature dimension, where such diversity of substructures helps enhance generalization ability. In contrast, DropMember directly discards all feature dimensions of certain nodes, which only produces a single substructure.
  • ...and 3 more figures

Theorems & Definitions (19)

  • Proposition 1
  • Definition 1
  • Definition 2
  • Definition 3
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Proposition 5
  • Proposition 6
  • Proof
  • ...and 9 more