Masked Graph Autoencoder with Non-discrete Bandwidths
Ziwen Zhao, Yuhua Li, Yixiong Zou, Jiliang Tang, Ruixuan Li
TL;DR
This work addresses the limited topological informativeness of discrete TopoRec methods (discrete edge masking with binary link reconstruction) by introducing Bandana, a masked graph autoencoder that uses continuous bandwidth masks sampled from a Boltzmann-Gibbs distribution together with a layer-wise bandwidth prediction objective. The authors show that continuous bandwidths preserve global graph connectivity and enable fine-grained neighborhood discrimination. They link the training objective to regularized denoising in a topological space and reinterpret bandwidth prediction as gradient optimization of the topological encoding distribution. Empirically, Bandana outperforms representative baselines on self-supervised link prediction and node classification across a broad set of datasets, and its dot-product probing evaluation provides a fair assessment of encoder quality. The approach offers a new paradigm for structure-learning pretext tasks in graph self-supervised learning (SSL), with theoretical grounding and practical benefits for topology-aware representation learning.
Abstract
Masked graph autoencoders have emerged as a powerful graph self-supervised learning method that has yet to be fully explored. In this paper, we show that the existing discrete edge masking and binary link reconstruction strategies are insufficient to learn topologically informative representations, from the perspective of message propagation on graph neural networks. Their limitations include blocked message flows, vulnerability to over-smoothing, and suboptimal neighborhood discriminability. Motivated by these findings, we explore non-discrete edge masks, which are sampled from a continuous and dispersive probability distribution instead of the discrete Bernoulli distribution. These masks, which we call "bandwidths", restrict the amount of output messages on each edge. We propose a novel, informative, and effective topological masked graph autoencoder using bandwidth masking and a layer-wise bandwidth prediction objective. We demonstrate its powerful graph topological learning ability both theoretically and empirically. Using only a structure-learning pretext task, our proposed framework outperforms representative baselines in both self-supervised link prediction (improving on the discrete edge reconstructors by up to 20%) and node classification on numerous datasets. Our implementation is available at https://github.com/Newiz430/Bandana.
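To make the idea of a continuous "bandwidth" mask concrete, the following is a minimal NumPy sketch, not the authors' implementation (which is in the repository linked above). It assumes that the bandwidths of the edges entering a node are obtained as a temperature-scaled softmax (a Boltzmann-Gibbs form) over random Gaussian energies, so that every edge keeps a strictly positive weight and no message flow is fully blocked. The function names `sample_bandwidths` and `propagate`, the Gaussian energies, the per-target normalization, and the `temperature` parameter are all illustrative assumptions.

```python
import numpy as np


def sample_bandwidths(edge_index, num_nodes, temperature=1.0, rng=None):
    """Sample one continuous mask value ("bandwidth") per directed edge.

    edge_index: int array of shape (2, E); row 0 = sources, row 1 = targets.
    Assumption: edges sharing a target node are normalized together by a
    softmax over random Gaussian energies, so their bandwidths are positive
    and sum to 1 for that target.
    """
    rng = np.random.default_rng() if rng is None else rng
    target = edge_index[1]
    energy = rng.standard_normal(edge_index.shape[1]) / temperature

    # Subtract the per-target maximum before exponentiating (numerical stability).
    max_per_target = np.full(num_nodes, -np.inf)
    np.maximum.at(max_per_target, target, energy)
    weights = np.exp(energy - max_per_target[target])

    # Normalize within each target node's set of incoming edges.
    norm = np.zeros(num_nodes)
    np.add.at(norm, target, weights)
    return weights / norm[target]


def propagate(x, edge_index, bandwidths):
    """One bandwidth-weighted message-passing step with sum aggregation."""
    source, target = edge_index
    out = np.zeros(x.shape, dtype=float)
    np.add.at(out, target, bandwidths[:, None] * x[source])
    return out


# Toy graph: edges 0->1, 1->2, 2->0, 0->2 (node 2 has two incoming edges).
edge_index = np.array([[0, 1, 2, 0],
                       [1, 2, 0, 2]])
x = np.eye(3)  # one-hot node features
bw = sample_bandwidths(edge_index, num_nodes=3, temperature=0.5,
                       rng=np.random.default_rng(0))
h = propagate(x, edge_index, bw)
```

In this sketch a lower `temperature` concentrates the bandwidth on fewer incoming edges, while a higher one spreads it more evenly. A discrete Bernoulli mask would instead set some of these weights exactly to zero.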
