UGMAE: A Unified Framework for Graph Masked Autoencoders

Yijun Tian, Chuxu Zhang, Ziyi Kou, Zheyuan Liu, Xiangliang Zhang, Nitesh V. Chawla

TL;DR

UGMAE addresses four limitations of graph masked autoencoders with four components: an adaptive feature mask generator, a ranking-based structure reconstruction objective, a bootstrapping-based similarity module, and a consistency assurance module. The approach combines low-level feature and structure reconstruction in the output space with high-level semantic guidance in the representation space, using momentum targets and self-distillation. Experiments on node classification, graph classification, and molecular property prediction show UGMAE outperforming both contrastive and generative state-of-the-art baselines. The resulting framework accounts for uneven node significance, holistic graph structure, semantic knowledge, and reconstruction stability within a single self-supervised model.

Abstract

Generative self-supervised learning on graphs, particularly graph masked autoencoders, has emerged as a popular learning paradigm and demonstrated its efficacy in handling non-Euclidean data. However, several remaining issues limit the capability of existing methods: 1) the disregard of uneven node significance in masking, 2) the underutilization of holistic graph information, 3) the neglect of semantic knowledge in the representation space due to the exclusive use of reconstruction loss in the output space, and 4) the unstable reconstructions caused by the large volume of masked content. In light of this, we propose UGMAE, a unified framework for graph masked autoencoders that addresses these issues from the perspectives of adaptivity, integrity, complementarity, and consistency. Specifically, we first develop an adaptive feature mask generator to account for the unique significance of nodes and sample informative masks (adaptivity). We then design a ranking-based structure reconstruction objective, used jointly with feature reconstruction, to capture holistic graph information and emphasize the topological proximity between neighbors (integrity). After that, we present a bootstrapping-based similarity module to encode the high-level semantic knowledge in the representation space, complementary to the low-level reconstruction in the output space (complementarity). Finally, we build a consistency assurance module to provide reconstruction objectives with extra stabilized consistency targets (consistency). Extensive experiments demonstrate that UGMAE outperforms both contrastive and generative state-of-the-art baselines on several tasks across multiple datasets.
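
The adaptivity idea can be illustrated with a minimal sketch, assuming per-node reconstruction errors are already available from an earlier pass. This is not the authors' mask generator: the function name `sample_adaptive_mask`, the `eps` smoothing term, and the exponential-race sampling trick are illustrative assumptions. The sketch only shows how nodes with larger reconstruction error can receive a higher chance of being masked (in line with the Figure 1(b) caption) while a fixed fraction of nodes is masked overall.

```python
import random


def sample_adaptive_mask(recon_errors, mask_rate, eps=1e-8, rng=None):
    """Return sorted indices of nodes to mask, favouring high-error nodes.

    recon_errors: non-negative per-node reconstruction errors (assumed given).
    mask_rate:    fraction of nodes to mask (plays the role of p_f).
    eps:          hypothetical smoothing term so every node keeps a positive weight.
    """
    if rng is None:
        rng = random.Random()
    n = len(recon_errors)
    k = max(1, round(mask_rate * n))
    # Exponential race: draw key_i ~ Exp(rate = weight_i). The node with the
    # smallest key is index i with probability weight_i / sum(weights), so the
    # k smallest keys form a weighted sample without replacement.
    keys = [(rng.expovariate(err + eps), i) for i, err in enumerate(recon_errors)]
    keys.sort()
    return sorted(i for _, i in keys[:k])


# Toy usage: node 2 has by far the largest error, so it is masked most often.
errors = [0.01, 0.01, 5.0, 0.01]
print(sample_adaptive_mask(errors, mask_rate=0.5, rng=random.Random(0)))
```

Sampling without replacement keeps the number of masked nodes fixed by the mask rate, so the error signal changes which nodes are masked rather than how many.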

Paper Structure

This paper contains 19 sections, 10 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: (a) The overall framework of UGMAE: we first design mask generators to obtain masked graphs and then send them into the encoder/decoder pipeline to learn representations. The learned representations are then used for the reconstruction and consistency assurance objectives in the output space. In addition, we calculate bootstrapping similarity in the representation space to capture the high-level semantic knowledge. (b) Adaptive feature mask generator: assigning high masking probabilities to features with high reconstruction errors. (c) Ranking-based structure reconstruction: encouraging connected nodes to be relatively similar. (d) Bootstrapping-based similarity: maximizing the agreement between learned and momentum representations. (e) Consistency assurance: minimizing the scaled cosine error between the constructed and momentum graphs.
  • Figure 2: Ablation studies of model components.
  • Figure 3: The performance of UGMAE w.r.t. different values of feature mask rate $p_f$.
  • Figure 4: The performance of UGMAE w.r.t. different values of structure mask rate $p_s$.
  • Figure 5: Node representation visualization on the Cora dataset. Different colors indicate different node categories.
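
The objectives named in panels (c) to (e) of Figure 1 can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions, not the paper's formulation: the triplet-style margin in `ranking_loss`, the momentum coefficient `tau` in `ema_update`, and the exponent `gamma` in `scaled_cosine_error` are hypothetical choices, and the exact losses in UGMAE may differ. The scaled cosine error is written here in the common form $(1 - \cos(x, z))^{\gamma}$ averaged over nodes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # small constant avoids division by zero


def ranking_loss(h, triplets, margin=0.5):
    """Margin ranking loss over (anchor, neighbor, non_neighbor) index triplets.

    The loss reaches zero once each anchor is more similar to its neighbor
    than to the non-neighbor by at least `margin` (assumed hyperparameter).
    """
    total = 0.0
    for a, pos, neg in triplets:
        total += max(0.0, margin - cosine(h[a], h[pos]) + cosine(h[a], h[neg]))
    return total / len(triplets)


def ema_update(target, online, tau=0.99):
    """Momentum update of target values: tau * target + (1 - tau) * online."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target, online)]


def scaled_cosine_error(pred, target, gamma=2.0):
    """Mean over rows of (1 - cos(pred_i, target_i)) ** gamma."""
    errs = [max(0.0, 1.0 - cosine(p, t)) ** gamma for p, t in zip(pred, target)]
    return sum(errs) / len(errs)


# Toy example: three nodes with 2-d representations; nodes 0 and 1 are connected.
h = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
print(ranking_loss(h, [(0, 1, 2)]))   # neighbor is already ranked above the non-neighbor
print(scaled_cosine_error(h, h))      # identical inputs give an error near zero
print(ema_update([1.0, 1.0], [0.0, 2.0], tau=0.9))
```

In this sketch the momentum values produced by `ema_update` stand in for the slowly moving targets against which agreement (cosine similarity) or consistency (scaled cosine error) is measured; raising `gamma` down-weights rows that are already well matched.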