Diffusion-Assisted Distillation for Self-Supervised Graph Representation Learning with MLPs
Seong Jin Ahn, Myoung-Ho Kim
TL;DR
DAD-SGM introduces a diffusion-based teacher assistant to bridge the capacity gap when distilling self-supervised GNN knowledge into MLPs. The method trains an MLP-denoising diffusion model to predict noise from the teacher and then distills the teacher into a student MLP by aligning diffusion-noise predictions, enabling scalable, robust self-supervised graph representations. Empirically, it yields up to 15% node-classification and 19% link-prediction gains over prior GNN-to-MLP distillation methods while maintaining fast inference on large graphs, and it shows improved robustness to noise and adversarial perturbations. The work suggests practical impact for large-scale graph analysis with lightweight models and motivates extensions to heterogeneous graphs via conditional diffusion modeling.
Abstract
For large-scale applications, there is growing interest in replacing Graph Neural Networks (GNNs) with lightweight Multi-Layer Perceptrons (MLPs) via knowledge distillation. However, distilling GNNs into MLPs is more challenging in self-supervised graph representation learning than in the supervised setting, because self-supervised performance depends more heavily on the model's inductive bias. This motivates us to design a new distillation method that bridges the large capacity gap between GNNs and MLPs in self-supervised graph representation learning. In this paper, we propose Diffusion-Assisted Distillation for Self-supervised Graph representation learning with MLPs (DAD-SGM). The proposed method employs a denoising diffusion model as a teacher assistant to better distill the knowledge from the teacher GNN into the student MLP. This approach enhances the generalizability and robustness of MLPs in self-supervised graph representation learning. Extensive experiments demonstrate that DAD-SGM distills the knowledge of self-supervised GNNs more effectively than state-of-the-art GNN-to-MLP distillation methods. Our implementation is available at https://github.com/SeongJinAhn/DAD-SGM.
