Hi-GMAE: Hierarchical Graph Masked Autoencoders

Chuang Liu, Zelin Yao, Yibing Zhan, Xueqi Ma, Dapeng Tao, Jia Wu, Wenbin Hu, Shirui Pan, Bo Du

TL;DR

Hi-GMAE addresses the limitation of single-scale graph masked autoencoders by introducing a multi-scale, hierarchical framework. It builds a graph hierarchy through pooling, enforces consistent masking across scales with coarse-to-fine (CoFi) masking and a gradual node-recovery strategy, and pairs a fine-grained GNN encoder and coarse-grained graph transformer with a symmetric decoder. The approach yields state-of-the-art results in unsupervised graph representation learning and competitive transfer learning performance on molecular datasets, demonstrating the importance of hierarchical information in graph pre-training. By capturing both local and global graph structure, this work broadens the applicability of self-supervised graph learning, with practical implications for molecular and social graphs alike.

Abstract

Graph Masked Autoencoders (GMAEs) have emerged as a notable self-supervised learning approach for graph-structured data. Existing GMAE models primarily focus on reconstructing node-level information, which makes them single-scale GMAEs. This methodology, while effective in certain contexts, tends to overlook the complex hierarchical structures inherent in many real-world graphs. For instance, molecular graphs exhibit a clear hierarchical organization in the form of the atoms-functional groups-molecules structure. Because single-scale GMAE models cannot incorporate these hierarchical relationships, they often fail to capture crucial high-level graph information, resulting in a noticeable decline in performance. To address this limitation, we propose Hierarchical Graph Masked AutoEncoders (Hi-GMAE), a novel multi-scale GMAE framework designed to handle the hierarchical structures within graphs. First, Hi-GMAE constructs a multi-scale graph hierarchy through graph pooling, enabling the exploration of graph structures across different granularity levels. To ensure masking uniformity of subgraphs across these scales, we propose a novel coarse-to-fine strategy that initiates masking at the coarsest scale and progressively back-projects the mask to the finer scales. Furthermore, we integrate a gradual recovery strategy with the masking process to mitigate the learning challenges posed by completely masked subgraphs. Diverging from the standard graph neural network (GNN) used in GMAE models, Hi-GMAE modifies its encoder and decoder into hierarchical structures. This entails using a GNN at the finer scales for detailed local graph analysis and a graph transformer at the coarser scales to capture global information. Our experiments on 15 graph datasets consistently demonstrate that Hi-GMAE outperforms 17 state-of-the-art self-supervised competitors.
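
The coarse-to-fine masking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hard cluster assignment at every pooling step (each node belongs to exactly one super-node), and the function name `cofi_mask`, the `assignments` layout, and the parameter names are hypothetical choices made for this example.

```python
import random

def cofi_mask(assignments, num_coarsest, mask_ratio, seed=0):
    """Sketch of coarse-to-fine masking under a hard-assignment assumption.

    assignments[s][i] is the index of the super-node (at scale s + 1) that
    contains node i (at scale s); scale 0 is the finest graph.
    Returns masks, where masks[s][i] is True if node i at scale s is masked.
    """
    rng = random.Random(seed)

    # Step 1: random masking on the coarsest graph only.
    num_masked = round(mask_ratio * num_coarsest)
    masked_ids = set(rng.sample(range(num_coarsest), num_masked))
    masks = [[i in masked_ids for i in range(num_coarsest)]]

    # Step 2: back-project the mask scale by scale (a simple "unpooling"):
    # a node is masked exactly when its parent super-node is masked.
    for assign in reversed(assignments):
        parent_mask = masks[0]
        masks.insert(0, [parent_mask[parent] for parent in assign])
    return masks

# Toy hierarchy: 6 nodes -> 3 super-nodes -> 2 super-nodes.
assignments = [[0, 0, 1, 1, 2, 2], [0, 0, 1]]
masks = cofi_mask(assignments, num_coarsest=2, mask_ratio=0.5)
for scale, mask in enumerate(masks):
    print(f"scale {scale}: {mask}")
```

Because every finer mask is derived from its parent scale, a subgraph that is masked at the coarsest level is masked as a whole at every finer level, which is the consistency property the abstract refers to.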

Paper Structure (37 sections, 14 equations, 12 figures, 10 tables)

This paper contains 37 sections, 14 equations, 12 figures, 10 tables.

Figures (12)

  • Figure 1: (a) GMAE models reconstruct randomly masked nodes using an autoencoder architecture. Because they focus primarily on the node level, they are categorized as single-scale approaches. (b) The inherent hierarchical structure of a molecular graph, which contains node-, subgraph-, and graph-level information, is termed multi-scale.
  • Figure 2: Illustration of multi-scale graph generation and masking matrix construction. (a) Generating Coarse Graphs: We utilize the graph pooling algorithm, which clusters nodes into super-nodes, to generate the coarse graph. (b) CoFi Masking: We conduct random masking on the coarsest graph and then project the mask matrix onto the fine-grained graph through an unpooling process. This approach ensures that the masking pattern remains consistently aligned across different scales of the graph. Please note that the above two processes, including pooling and unpooling, are iterated across $S$ scales, although only three scales are presented here for the sake of clarity.
  • Figure 3: Illustration of the dynamic node recovery mechanism employed in CoFi masking. At each epoch, a certain number of nodes are randomly selected for recovery from those that have been masked. The number of recovered nodes (i.e., $R$) gradually decreases as training proceeds.
  • Figure 4: Overview of the proposed model. (a) Encoder: The masked input graph first undergoes fine-grained graph convolution (GNN) and is then pooled. Subsequently, a coarse-grained graph transformer (GT) layer is applied to the coarse graph to facilitate the learning of high-level information. (b) Decoder: After the unmasked nodes are encoded, an unpooling strategy progressively recovers the original graph across different scales. For detailed technical descriptions of masking, pooling, and unpooling, please refer to Section \ref{sec:propose-method}. $(\boldsymbol{M}^{(2)}, \boldsymbol{A}^{(2)})$, shown in grey, are pre-processed before training.
  • Figure 5: Comparison of different encoder architectures on three datasets. Fi-Co denotes our proposed fine- and coarse-grained architecture. GNN-based/GT-based characterizes the approach of solely employing GNN/GT as the encoder at every scale.
  • ...and 7 more figures
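
The node recovery mechanism in the Figure 3 caption can be sketched as follows. The caption says only that the number of recovered nodes $R$ decreases during training. The linear decay schedule, the `initial_ratio` parameter, and the function name `recover_nodes` are assumptions made for illustration and are not taken from the paper.

```python
import random

def recover_nodes(mask, epoch, total_epochs, initial_ratio=0.5, seed=None):
    """Sketch of gradual node recovery on a boolean mask (True = masked).

    Assumption: the number of recovered nodes R decays linearly from
    initial_ratio * (#masked nodes) at epoch 0 down to 0 at total_epochs.
    """
    rng = random.Random(seed)
    masked = [i for i, is_masked in enumerate(mask) if is_masked]

    # Hypothetical linear decay of the recovery budget over training.
    decay = max(0.0, 1.0 - epoch / total_epochs)
    num_recover = int(initial_ratio * decay * len(masked))

    # Randomly pick R masked nodes and unmask ("recover") them.
    recovered = set(rng.sample(masked, num_recover))
    return [is_masked and i not in recovered for i, is_masked in enumerate(mask)]

# Toy fine-scale mask: 8 masked nodes, 2 visible nodes.
mask = [True] * 8 + [False] * 2
for epoch in (0, 5, 10):
    new_mask = recover_nodes(mask, epoch, total_epochs=10, seed=0)
    print(epoch, sum(new_mask), "nodes still masked")
```

Early in training, more masked nodes are revealed, so fewer subgraphs are completely hidden. As the budget shrinks, the task approaches the full CoFi mask.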