Training MLPs on Graphs without Supervision

Zehong Wang, Zheyuan Zhang, Chuxu Zhang, Yanfang Ye

TL;DR

SimMLP tackles the latency bottleneck of neighborhood-based GNN inference by training MLPs on graphs through self-supervised alignment with a GNN encoder. The method defines a mutual-information–driven objective that links structure-aware GNN embeddings with structure-free MLP embeddings, augmented by a reconstruction term and safeguards against collapse. The authors prove an information-theoretic basis for why SimMLP can be equivalent to GNNs in the optimal limit and demonstrate strong empirical performance across node classification, link prediction, and graph classification, with particular gains in inductive and cold-start scenarios. Importantly, SimMLP yields large inference speedups (roughly 90–126×) and robust performance under feature/edge noise and scarce labels, enabling scalable and versatile graph learning with MLPs. The training objective, $\mathcal{L} = \sum_{i} \|\rho({\mathbf{h}}_i^{MLP}) - {\mathbf{h}}_i^{GNN}\|^2 + \lambda \|{\mathcal{D}}({\mathbf{h}}_i^{GNN}) - {\mathbf{x}}_i\|^2$, together with mutual information considerations, underpins the theoretical and practical contributions.
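The objective in the TL;DR can be evaluated numerically with a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `simmlp_style_loss`, the list-based vector representation, and the reading that the sum over $i$ covers both the alignment and the reconstruction term are all assumptions. The projector `rho` and decoder `decoder` are passed in as plain callables.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def simmlp_style_loss(
    h_mlp: Sequence[Vector],              # structure-free MLP embeddings, one per node
    h_gnn: Sequence[Vector],              # structure-aware GNN embeddings, one per node
    x: Sequence[Vector],                  # raw node features, one per node
    rho: Callable[[Vector], Vector],      # projector applied to MLP embeddings
    decoder: Callable[[Vector], Vector],  # decoder D mapping GNN embeddings to features
    lam: float,                           # weight lambda of the reconstruction term
) -> float:
    """Sum over nodes of alignment + lam * reconstruction (assumed reading)."""
    total = 0.0
    for hm, hg, xi in zip(h_mlp, h_gnn, x):
        align = squared_l2(rho(hm), hg)      # ||rho(h_i^MLP) - h_i^GNN||^2
        recon = squared_l2(decoder(hg), xi)  # ||D(h_i^GNN) - x_i||^2
        total += align + lam * recon
    return total


def identity(v: Vector) -> Vector:
    """Stand-in for rho and D in this toy example only."""
    return v


# Toy example with a single node and identity projector/decoder.
loss = simmlp_style_loss(
    h_mlp=[[1.0, 2.0]],
    h_gnn=[[1.0, 0.0]],
    x=[[0.0, 0.0]],
    rho=identity,
    decoder=identity,
    lam=0.5,
)
print(loss)  # alignment 4.0 + 0.5 * reconstruction 1.0 = 4.5
```

In this toy case the alignment term is 4.0 and the reconstruction term is 1.0, so the loss is 4.5. In practice `rho` and `decoder` would be learned networks rather than identity maps.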

Abstract

Graph Neural Networks (GNNs) have demonstrated their effectiveness in various graph learning tasks, yet their reliance on neighborhood aggregation during inference poses challenges for deployment in latency-sensitive applications, such as real-time financial fraud detection. To address this limitation, recent studies have proposed distilling knowledge from teacher GNNs into student Multi-Layer Perceptrons (MLPs) trained on node content, aiming to accelerate inference. However, these approaches often inadequately explore structural information when inferring unseen nodes. To this end, we introduce SimMLP, a Self-supervised framework for learning MLPs on graphs, designed to fully integrate rich structural information into MLPs. Notably, SimMLP is the first MLP-learning method that can achieve equivalence to GNNs in the optimal case. The key idea is to employ self-supervised learning to align the representations encoded by graph context-aware GNNs and neighborhood dependency-free MLPs, thereby fully integrating the structural information into MLPs. We provide a comprehensive theoretical analysis, demonstrating the equivalence between SimMLP and GNNs based on mutual information and inductive bias, highlighting SimMLP's advanced structural learning capabilities. Additionally, we conduct extensive experiments on 20 benchmark datasets, covering node classification, link prediction, and graph classification, to showcase SimMLP's superiority over state-of-the-art baselines, particularly in scenarios involving unseen nodes (e.g., inductive and cold-start node classification) where structural insights are crucial. Our codes are available at: https://github.com/Zehong-Wang/SimMLP.
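The latency gap described in the abstract comes from what each encoder must read at inference time. The following sketch contrasts the two paths under loud assumptions: the one-layer mean-aggregation GNN, the shared linear-plus-ReLU layer, and the names `mlp_encode` and `gnn_encode` are hypothetical illustrations, not the architecture used in the paper.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mlp_encode(x_i: Sequence[float], weight: Sequence[Sequence[float]]) -> Vector:
    """Linear layer followed by ReLU; reads only the node's own features."""
    return [max(0.0, sum(w * v for w, v in zip(row, x_i))) for row in weight]


def gnn_encode(
    i: int,
    features: Sequence[Vector],
    neighbors: Dict[int, List[int]],
    weight: Sequence[Sequence[float]],
) -> Vector:
    """Mean over the node and its neighbors, then the same linear + ReLU.

    Unlike mlp_encode, this must fetch every neighbor's features first.
    """
    group = [features[i]] + [features[j] for j in neighbors[i]]
    dim = len(features[i])
    mean = [sum(f[d] for f in group) / len(group) for d in range(dim)]
    return mlp_encode(mean, weight)


# Toy graph: node 0 is connected to nodes 1 and 2; node 3 is isolated (cold start).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
neighbors = {0: [1, 2], 1: [0], 2: [0], 3: []}
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only

h_mlp_0 = mlp_encode(features[0], weight)             # uses x_0 only
h_gnn_0 = gnn_encode(0, features, neighbors, weight)  # uses x_0, x_1, x_2
h_gnn_3 = gnn_encode(3, features, neighbors, weight)  # no neighbors to aggregate
print(h_mlp_0, h_gnn_0, h_gnn_3)
```

For the isolated node 3, this toy GNN has nothing to aggregate and reduces to the MLP path, which illustrates why the cold-start setting depends on structural knowledge already being stored in the MLP's weights.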

Paper Structure

This paper contains 34 sections, 3 theorems, 12 equations, 8 figures, 21 tables.

Key Result

Proposition 4.1

Suppose ${\mathcal{G}} = ({\mathbf{A}}, {\mathbf{X}})$ is sampled from a latent graph ${\mathcal{G}}_{\mathcal{I}} = ({\mathbf{A}}, {\mathbf{F}})$, ${\mathcal{G}} \sim P({\mathcal{G}}_{\mathcal{I}})$, and ${\mathbf{F}}^*$ is the lossless compression of ${\mathbf{F}}$ that $\mathbb{E}[{\mathbf{X}} |

Figures (8)

  • Figure 1: Accuracy vs. Inference Time on Arxiv dataset under cold-start setting.
  • Figure 2: Overview of SimMLP. During pre-training, SimMLP uses GNN and MLP encoders to obtain node embeddings separately and employs a self-supervised loss to maximize their alignment. To avoid trivial solutions, SimMLP further applies two strategies discussed in Section \ref{sec:prevent trivial solution}. During inference, SimMLP uses the pre-trained MLP to encode node embeddings, achieving significant acceleration by avoiding neighborhood fetching.
  • Figure 3: Model collapse occurs when the alignment loss (Equation \ref{eq:objective}) is applied naively. The strategies proposed in Section \ref{sec:prevent trivial solution} prevent this collapse.
  • Figure 4: Node classification on heterophily graphs.
  • Figure 5: Link prediction performance.
  • ...and 3 more figures

Theorems & Definitions (3)

  • Proposition 4.1
  • Lemma 4.2
  • Proposition 4.3