Variational Graph Contrastive Learning

Shifeng Xie, Jhony H. Giraldo

TL;DR

This work proposes a novel Subgraph Gaussian Embedding Contrast (SGEC) method, which adaptively maps subgraphs to a structured Gaussian space, ensuring the preservation of graph characteristics while controlling the distribution of generated subgraphs.

Abstract

Graph representation learning (GRL) is a fundamental task in machine learning, aiming to encode high-dimensional graph-structured data into low-dimensional vectors. Self-supervised learning (SSL) methods are widely used in GRL because they avoid expensive human annotation. In this work, we propose a novel Subgraph Gaussian Embedding Contrast (SGEC) method. Our approach introduces a subgraph Gaussian embedding module, which adaptively maps subgraphs to a structured Gaussian space, preserving graph characteristics while controlling the distribution of generated subgraphs. We employ optimal transport distances, including Wasserstein and Gromov-Wasserstein distances, to measure the similarity between subgraphs, enhancing the robustness of the contrastive learning process. Extensive experiments across multiple benchmarks demonstrate that SGEC outperforms or performs competitively with state-of-the-art approaches. Our findings provide insights into the design of SSL methods for GRL, emphasizing the importance of the distribution of the generated contrastive pairs.
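To make the role of the Wasserstein distance concrete, the following is a minimal sketch, not the authors' implementation. It assumes two subgraphs of equal size whose node embeddings carry uniform mass, and uses the Euclidean distance as ground cost. Under these assumptions an optimal transport plan is a one-to-one matching, so for very small subgraphs the distance can be found by brute force over permutations. The function name `wasserstein_uniform` and the toy embeddings are hypothetical.

```python
import itertools
import math


def wasserstein_uniform(xs, ys):
    """Exact 1-Wasserstein distance between two equal-size point sets.

    Assumes uniform weights on both sets, so an optimal plan is a
    permutation. Brute force: only practical for very small sets.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch requires point sets of equal size")
    n = len(xs)
    # Pairwise Euclidean ground cost between embeddings.
    cost = [[math.dist(x, y) for y in ys] for x in xs]
    # Try every one-to-one matching and keep the cheapest total cost.
    best = min(
        sum(cost[i][perm[i]] for i in range(n))
        for perm in itertools.permutations(range(n))
    )
    return best / n  # each point carries mass 1/n


# Toy embeddings of two 2-node subgraphs (illustrative values only).
sub_a = [(0.0, 0.0), (1.0, 0.0)]
sub_b = [(0.0, 1.0), (1.0, 1.0)]
d_same = wasserstein_uniform(sub_a, list(reversed(sub_a)))
d_shift = wasserstein_uniform(sub_a, sub_b)
```

The distance does not depend on node ordering, which is why the reversed copy of `sub_a` is at distance zero. A practical implementation would use a dedicated optimal transport solver rather than enumeration.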

Paper Structure

This paper contains 14 sections, 11 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: t-SNE visualizations of the graph contrastive learning methods GCA, GSC, and our method SGEC on the Cora dataset. Each point in the visualization corresponds to a node embedding, with colors indicating classes. SGEC maps the node representations into a dense, uniform, and more linearly separable space.
  • Figure 2: Overview of the Subgraph Gaussian Embedding Contrast (SGEC) method. The SGEC method begins with a graph encoder to generate embeddings and employs a breadth-first search to sample subgraphs for some set of nodes $\mathcal{S}$. A Gaussian embedding module then produces contrastive samples of these subgraphs in a Gaussian space. Positive pairs consist of subgraphs with the same central node, while negative pairs have different central nodes. SGEC uses the Wasserstein and Gromov-Wasserstein distances to compute distances between subgraphs for contrastive learning.
  • Figure 3: Sensitivity analysis of the hyperparameter $\beta$, shown with standard deviation.
  • Figure 4: Sensitivity analysis of subgraph size $k$ on the Cora dataset. The plot shows the mean test accuracy (dark blue line) with error bars representing the mean $\pm$ 3 standard deviations (light blue dashed lines) for different subgraph sizes.
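
The breadth-first subgraph sampling described in the Figure 2 caption, with the subgraph size $k$ from Figure 4, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: the graph is given as a plain adjacency dictionary, neighbors are visited in list order, and $k \geq 1$. The names `bfs_subgraph` and `adj` are hypothetical.

```python
from collections import deque


def bfs_subgraph(adj, center, k):
    """Return up to k node ids around `center`, in breadth-first order.

    `adj` maps a node id to a list of neighbor ids. Assumes k >= 1;
    the central node is always the first element of the result.
    """
    visited = [center]
    seen = {center}
    queue = deque([center])
    while queue and len(visited) < k:
        node = queue.popleft()
        for neighbor in adj.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
                if len(visited) == k:
                    break  # subgraph is full, stop expanding
    return visited


# Toy path-like graph: 3 - 1 - 0 - 2 (illustrative only).
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
sub_0 = bfs_subgraph(adj, center=0, k=3)
sub_3 = bfs_subgraph(adj, center=3, k=2)
```

In the terms of the caption, two views of the subgraph around the same central node would form a positive pair, while `sub_0` and `sub_3`, which have different central nodes, would form a negative pair.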