ID-MixGCL: Identity Mixup for Graph Contrastive Learning

Gehang Zhang, Bowen Yu, Jiangxia Cao, Xinghua Zhang, Jiawei Sheng, Chuan Zhou, Tingwen Liu

TL;DR

ID-MixGCL addresses the misleading assumption in graph contrastive learning that augmentations preserve label semantics. It introduces identity-label mixup, jointly interpolating node representations and per-node identity labels to produce soft-confidence samples, and employs a two-view GCL framework with a two-layer projection head trained via an N-Pair contrastive loss. Empirically, ID-MixGCL achieves state-of-the-art results on eight node-classification datasets and five of six graph-classification datasets, with improvements up to 29.13% over prior methods; analyses show gains in alignment and uniformity and robustness to deeper architectures. The approach provides a generic, scalable baseline that improves robustness to augmentation-induced label changes and enhances the learning of fine-grained latent representations in unlabeled graphs.
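The joint interpolation of representations and identity labels can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`identity_mixup`, `soft_contrastive_loss`), the fixed mixing coefficient `lam`, the cosine similarity, the temperature `tau`, and the soft-label cross-entropy form of the loss are all assumptions made for illustration, and the projection head is omitted.

```python
import math
import random


def identity_mixup(H, lam, perm):
    """Mix row i of H with row perm[i] and build matching soft identity labels.

    Assumption: one shared coefficient `lam`; the paper may sample it differently.
    """
    n = len(H)
    mixed, labels = [], []
    for i in range(n):
        j = perm[i]
        # Interpolate the two node representations.
        mixed.append([lam * a + (1.0 - lam) * b for a, b in zip(H[i], H[j])])
        # Interpolate the one-hot identity labels with the same weights.
        y = [0.0] * n
        y[i] += lam
        y[j] += 1.0 - lam
        labels.append(y)
    return mixed, labels


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)


def soft_contrastive_loss(Z_mixed, Z_other, labels, tau=0.5):
    """Cross-entropy between soft identity labels and a softmax over similarities.

    Assumption: an illustrative stand-in for the paper's N-Pair contrastive loss.
    """
    n = len(Z_mixed)
    total = 0.0
    for i in range(n):
        logits = [cosine(Z_mixed[i], Z_other[k]) / tau for k in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total -= sum(labels[i][k] * (logits[k] - log_norm) for k in range(n))
    return total / n


random.seed(0)
# Toy node representations for one view and a noisy copy standing in for the other view.
H = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
H_tilde = [[x + 0.1 * random.gauss(0, 1) for x in row] for row in H]
perm = list(range(5))
random.shuffle(perm)

H_mixed, Y = identity_mixup(H, lam=0.8, perm=perm)
loss = soft_contrastive_loss(H_mixed, H_tilde, Y)
print(f"soft-label contrastive loss: {loss:.4f}")
```

With `lam = 1.0` the sketch reduces to ordinary one-hot identity labels, which is the setting in which augmented and original nodes are treated as having identical labels.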

Abstract

Graph contrastive learning (GCL) has recently achieved substantial advancements. Existing GCL approaches compare two different "views" of the same graph in order to learn node/graph representations. The underlying assumption of these studies is that the graph augmentation strategy can generate several graph views that are structurally different from, but semantically similar to, the original graph, so that the ground-truth labels of the original and augmented graphs/nodes can be regarded as identical in contrastive learning. However, we observe that this assumption does not always hold. For instance, deleting a super-node in a social network can substantially influence the community partitioning of other nodes. Similarly, any perturbation to nodes or edges in a molecular graph will change the labels of the graph. Therefore, we believe that augmenting the graph while adapting the labels used for the contrastive loss will help the encoder learn a better representation. Based on this idea, we propose ID-MixGCL, which simultaneously interpolates input nodes and their corresponding identity labels to obtain soft-confidence samples with a controllable degree of change, leading to the capture of fine-grained representations from self-supervised training on unlabeled graphs. Experimental results demonstrate that ID-MixGCL improves performance on graph classification and node classification tasks, with significant improvements of 3-29% absolute points over state-of-the-art techniques on the Cora, IMDB-B, IMDB-M, and PROTEINS datasets.

Paper Structure (32 sections, 7 equations, 7 figures, 7 tables, 1 algorithm)

Figures (7)

  • Figure 1: An example to illustrate that augmentation on the graph can unexpectedly change the type of graph or nodes. The left panel shows that removing a carbon atom from the phenyl ring of aspirin causes the molecule to become an alkene chain, and the right panel shows that removing the link between Tom and Bob causes Tom to become an isolated node. This motivates us to generate 'soft-confidence' contrastive samples.
  • Figure 2: Overview of ID-MixGCL. The original graph $\mathcal{G}$ is augmented into two different but semantically similar views, $\mathcal{G}'=\{\bm{X}',\bm{A}'\}$ and $\tilde{\mathcal{G}}=\{\tilde{\bm{X}},\tilde{\bm{A}}\}$. We then feed the graph views $\mathcal{G}'$ and $\tilde{\mathcal{G}}$ into a GNN encoder $f(\cdot)$ to obtain two node representation matrices, $\bm{H}$ and $\tilde{\bm{H}}$. Next, we apply a mixup operation to the representation matrix $\bm{H}$, which yields the mixed node representation matrix $\bm{H}'$. After passing the representations through a shared projection head $g(\cdot)$, we use a contrastive loss to maximize the agreement between the representations $\bm{Z}'$ and $\tilde{\bm{Z}}$.
  • Figure 3: Illustration of our proposed identity mixup strategy. We select a different sample from the same batch and apply the mixup operation to it using a varied strategy for each sample.
  • Figure 4: $\mathcal{L}_{\texttt{align}}$-$\mathcal{L}_{\texttt{uniform}}$ plot of ID-MixGCL, GRACE, MVGRL, and COSTA on the Cora dataset. For both metrics, lower is better.
  • Figure 5: Analytic experiment comparing the performance of different mixup strategies on node-level (e.g., Cora, CiteSeer) and graph-level (e.g., IMDB-M, PROTEINS) datasets. Only two strategies are shown for PROTEINS, since LocalMixup exceeded the 32 GB GPU memory limit.
  • ...and 2 more figures
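
The alignment and uniformity metrics in Figure 4 are not defined in the captions above. A common formulation, assumed here, is the one from Wang and Isola: alignment is the mean distance between positive pairs, and uniformity is the log of the mean Gaussian potential over pairs of embeddings on the unit hypersphere. The sketch below follows that formulation. The function names and the defaults `alpha=2` and `t=2` are illustrative choices, not values confirmed by this paper.

```python
import math


def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def align_loss(X, Y, alpha=2):
    """Mean of ||x - y||^alpha over positive pairs (x, y); lower is better."""
    return sum(sq_dist(x, y) ** (alpha / 2) for x, y in zip(X, Y)) / len(X)


def uniform_loss(X, t=2):
    """Log of the mean of exp(-t * ||x_i - x_j||^2) over distinct pairs; lower is better."""
    n = len(X)
    vals = [math.exp(-t * sq_dist(X[i], X[j]))
            for i in range(n) for j in range(i + 1, n)]
    return math.log(sum(vals) / len(vals))


# Toy embeddings: four points spread around the unit circle versus four collapsed points.
spread = [l2_normalize(v) for v in [[1, 0], [0, 1], [-1, 0], [0, -1]]]
collapsed = [l2_normalize([1, 0]) for _ in range(4)]

print("alignment of identical views:", align_loss(spread, spread))
print("uniformity (spread):", uniform_loss(spread))
print("uniformity (collapsed):", uniform_loss(collapsed))
```

Under these definitions, embeddings that collapse to a single point attain the worst (highest) uniformity value of zero, while well-spread embeddings give a negative value.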