Mixed Graph Contrastive Network for Semi-Supervised Node Classification

Xihong Yang; Yiqi Wang; Yue Liu; Yi Wen; Lingyuan Meng; Sihang Zhou; Xinwang Liu; En Zhu

TL;DR

This work proposes a novel graph contrastive learning method, termed Mixed Graph Contrastive Network (MGCN), which improves the discriminative capability of the latent embeddings by an interpolation-based augmentation strategy and a correlation reduction mechanism.

Abstract

Graph Neural Networks (GNNs) have achieved promising performance in semi-supervised node classification in recent years. However, insufficient supervision, together with representation collapse, largely limits the performance of GNNs in this field. To alleviate the collapse of node representations in the semi-supervised scenario, we propose a novel graph contrastive learning method, termed Mixed Graph Contrastive Network (MGCN). In our method, we improve the discriminative capability of the latent embeddings through an interpolation-based augmentation strategy and a correlation reduction mechanism. Specifically, we first conduct interpolation-based augmentation in the latent space and then force the prediction model to change linearly between samples. Second, we enable the learned network to tell apart samples across two interpolation-perturbed views by forcing the cross-view correlation matrix to approximate an identity matrix. By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning. Extensive experimental results on six datasets demonstrate the effectiveness and generality of MGCN compared with existing state-of-the-art methods. The code of MGCN is available on GitHub at https://github.com/xihongyang1999/MGCN.
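
The two mechanisms named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation (their code is in the repository linked above). The function names, the random-permutation pairing of nodes, the cosine normalisation, and the mean-squared-error form of the loss are all assumptions made for illustration. The interpolation rate of 0.95 is the value quoted in the Figure 2 caption.

```python
import numpy as np


def interpolate_view(h, lam, rng):
    """Latent-space interpolation: mix each node embedding with the
    embedding of a randomly paired node (pairing scheme is an assumption)."""
    perm = rng.permutation(h.shape[0])
    return lam * h + (1.0 - lam) * h[perm], perm


def mixup_targets(y_onehot, lam, perm):
    """Interpolate the labels with the same rate and pairing, so that the
    predictor is pushed to change linearly between samples."""
    return lam * y_onehot + (1.0 - lam) * y_onehot[perm]


def cross_view_correlation(h1, h2, eps=1e-12):
    """Sample-level correlation between two views, computed here as cosine
    similarity between every pair of nodes (normalisation is an assumption)."""
    h1n = h1 / (np.linalg.norm(h1, axis=1, keepdims=True) + eps)
    h2n = h2 / (np.linalg.norm(h2, axis=1, keepdims=True) + eps)
    return h1n @ h2n.T


def correlation_reduction_loss(h1, h2):
    """Penalise the gap between the cross-view correlation matrix and the
    identity (mean squared error is an assumed choice of distance)."""
    s = cross_view_correlation(h1, h2)
    return float(np.mean((s - np.eye(s.shape[0])) ** 2))


# Toy usage: 5 nodes, 8-dimensional embeddings, 3 classes (all hypothetical).
rng = np.random.default_rng(0)
h = rng.normal(size=(5, 8))
labels = np.eye(3)[[0, 1, 2, 0, 1]]

lam = 0.95  # interpolation rate quoted in the Figure 2 caption
view1, perm1 = interpolate_view(h, lam, rng)
view2, perm2 = interpolate_view(h, lam, rng)
targets1 = mixup_targets(labels, lam, perm1)
loss = correlation_reduction_loss(view1, view2)
```

With a rate close to 1, each view stays a small perturbation of the original embedding, so the diagonal of the cross-view matrix is already near 1 and the loss mostly penalises similarity between different nodes.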

Paper Structure (26 sections, 3 theorems, 11 equations, 8 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Semi-supervised Node Classification
Representation Collapse
Interpolation-based Augmentation
Method
Notations and Problem Definition
Graph Interpolation Module
Correlation Reduction Module
Loss Function
Theoretical Analysis
Experiment
Datasets & Metric
Experiment Setup
Performance Comparison
...and 11 more sections

Key Result

Theorem 1

For any $\delta>0$ and $\gamma \in \Gamma$, for all $h_\gamma \in \mathcal{H}_\gamma$, with probability at least $1 - \delta/|\Gamma|$, we have:

Figures (8)

Figure 1: Visualization of the cosine similarity matrices of the output embeddings of GCN, MixupForGraph, MVGRL and our proposed method on the ACM dataset. The sample order is rearranged so that samples from the same cluster are adjacent. A higher value (red) indicates that embeddings are more similar and therefore more prone to representation collapse. A lower value (blue) indicates that the embeddings are less similar.
Figure 2: Illustration of the Mixed Graph Contrastive Network (MGCN). In the Graph Interpolation Module, given the generated embedding $\mathbf{H}$, we first adopt the interpolation-based strategy to conduct data augmentation in the latent space; then, by guiding $\mathbf{H}^{v_2}$ to approximate the prediction $\mathbf{Y}^{v_2}$, we force the prediction model to change linearly between samples. Afterward, by guiding the cross-view correlation matrix to approximate the identity matrix, we enable the learned network to tell apart samples across two interpolation-perturbed views. In this manner, our network is guided to learn more discriminative embeddings, thus alleviating representation collapse. In our model, the interpolation rate $\lambda$ is set to $0.95$ to ensure that $\mathbf{H}^{v_k}$ is a perturbation of $\mathbf{H}$.
Figure 3: $t$-SNE visualization of seven methods on two datasets. The first row and second row correspond to ACM and DBLP, respectively.
Figure 4: Ablation comparisons of the proposed modules on six datasets. "B", "B+I", "B+C" and "Ours" denote the baseline, the baseline with the graph interpolation module, the baseline with the correlation reduction module, and the baseline with both modules, respectively.
Figure 5: Effectiveness and sensitivity analysis of the hyper-parameters $\alpha$ and $\lambda$. The figures show how the results vary with the two parameters on all six datasets.
...and 3 more figures

Theorems & Definitions (3)

Theorem 1
Lemma 1
Lemma 2
