Rethinking Graph Masked Autoencoders through Alignment and Uniformity

Liang Wang, Xiang Tao, Qiang Liu, Shu Wu, Liang Wang

TL;DR

The paper establishes a theoretical link between GraphMAE and graph contrastive learning by showing that node-level reconstruction implicitly enacts context-level alignment. It identifies alignment and uniformity gaps in GraphMAE due to random masking and lack of explicit distributional regularization. To address these gaps, it introduces AUG-MAE with an easy-to-hard adversarial masking strategy and an explicit uniformity regularizer, backed by theoretical bounds. Empirically, AUG-MAE demonstrates superior performance over GraphMAE and many baselines on node- and graph-classification benchmarks, while also exhibiting improved alignment and uniformity properties.

Abstract

Self-supervised learning on graphs can be bifurcated into contrastive and generative methods. Contrastive methods, also known as graph contrastive learning (GCL), have dominated graph self-supervised learning in the past few years, but the recent advent of the graph masked autoencoder (GraphMAE) has rekindled the momentum behind generative methods. Despite the empirical success of GraphMAE, there is still a dearth of theoretical understanding regarding its efficacy. Moreover, while both generative and contrastive methods have been shown to be effective, their connections and differences have yet to be thoroughly investigated. Therefore, we theoretically build a bridge between GraphMAE and GCL, and prove that the node-level reconstruction objective in GraphMAE implicitly performs context-level GCL. Based on our theoretical analysis, we further identify the limitations of GraphMAE from the perspectives of alignment and uniformity, which are considered two key properties of high-quality representations in GCL. We point out that GraphMAE's alignment performance is restricted by its masking strategy, and that uniformity is not strictly guaranteed. To remedy these limitations, we propose an Alignment-Uniformity enhanced Graph Masked AutoEncoder, named AUG-MAE. Specifically, we propose an easy-to-hard adversarial masking strategy to provide hard-to-align samples, which improves the alignment performance. Meanwhile, we introduce an explicit uniformity regularizer to ensure the uniformity of the learned representations. Experimental results on benchmark datasets demonstrate the superiority of our model over existing state-of-the-art methods.
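
To make the two properties named in the abstract concrete, the following is a minimal sketch of alignment and uniformity as they are commonly measured in the contrastive learning literature: alignment as the mean distance between positive pairs on the unit hypersphere, and uniformity as the log of the mean pairwise Gaussian potential. This is an illustrative assumption, not the paper's implementation; the abstract does not give the exact form of the AUG-MAE regularizer, and the function names, the exponent `alpha`, and the temperature `t` are hypothetical choices made for this example.

```python
import numpy as np


def l2_normalize(z, eps=1e-12):
    """Project each row of z onto the unit hypersphere."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z / np.maximum(norms, eps)


def alignment(z_a, z_b, alpha=2.0):
    """Mean distance (raised to alpha) between positive pairs.

    Row i of z_a is assumed to be the positive partner of row i of z_b.
    Lower values mean better alignment.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    return float(np.mean(np.linalg.norm(z_a - z_b, axis=1) ** alpha))


def uniformity(z, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower values mean the points are spread more evenly on the sphere.
    """
    z = l2_normalize(z)
    # For unit vectors, ||u - v||^2 = 2 - 2 * <u, v>.
    sq_dist = 2.0 - 2.0 * (z @ z.T)
    upper = np.triu_indices(z.shape[0], k=1)  # each distinct pair once
    return float(np.log(np.mean(np.exp(-t * sq_dist[upper]))))


# Synthetic embeddings (not data from the paper).
rng = np.random.default_rng(0)
spread = rng.normal(size=(256, 16))                           # well spread
collapsed = np.ones((256, 16)) + 0.01 * rng.normal(size=(256, 16))  # nearly one point
noisy_view = spread + 0.05 * rng.normal(size=spread.shape)    # mild perturbation
unrelated = rng.normal(size=spread.shape)                     # no pairing at all

print("uniformity (spread):   ", uniformity(spread))
print("uniformity (collapsed):", uniformity(collapsed))
print("alignment (noisy view):", alignment(spread, noisy_view))
print("alignment (unrelated): ", alignment(spread, unrelated))
```

In this sketch, nearly collapsed embeddings score worse (higher) on uniformity than well-spread ones, and a lightly perturbed view scores better (lower) on alignment than an unrelated set of vectors. This matches the qualitative reading of the two properties used in the abstract.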

Paper Structure (32 sections, 20 equations, 6 figures, 5 tables, 1 algorithm)

Figures (6)

  • Figure 1: Distribution of node representations on the unit hypersphere learned by GCL (taking GRACE as an example) and GraphMAE. The representations learned by GCL are more uniformly distributed than those learned by GraphMAE.
  • Figure 2: The overall framework of our proposed AUG-MAE model. We propose an easy-to-hard adversarial masking strategy to provide hard-to-align positive pairs, so as to improve the alignment ability of GraphMAE. Additionally, we introduce an explicit uniformity regularizer $\mathcal{L}_{\mathrm{Uni}}$ into the objective to enhance the uniformity of learned representations.
  • Figure 3: Effect of different hyper-parameters. The y-axis represents accuracy (%).
  • Figure 4: $l_2$ distances between positive representations of Cora learned by GCL, GraphMAE, and AUG-MAE. A smaller mean distance indicates better alignment.
  • Figure 5: Representation distributions of Cora on $\mathcal{S}^1$ learned by GCL, GraphMAE, and AUG-MAE. We plot distributions with Gaussian kernel density estimation in $\mathbb{R}^2$.
  • ...and 1 more figures