Learning on Large-scale Text-attributed Graphs via Variational Inference

Jianan Zhao, Meng Qu, Chaozhuo Li, Hao Yan, Qian Liu, Rui Li, Xing Xie, Jian Tang

TL;DR

This work tackles node classification on text-attributed graphs by coupling large language models with graph neural networks through a variational EM framework (GLEM). By alternating between E-step LM optimization and M-step GNN optimization and using pseudo-labels to distill information across modules, GLEM achieves scalable, state-of-the-art performance on large TAG benchmarks. The approach demonstrates strong empirical results across transductive and structure-free inductive settings and remains effective when scaling to large LMs, highlighting its practical impact for integrating textual semantics with graph structure. Overall, GLEM offers a principled, efficient solution for leveraging both modalities in TAGs, with broad implications for scalable multimodal graph learning.

Abstract

This paper studies learning on text-attributed graphs (TAGs), where each node is associated with a text description. An ideal solution to this problem would integrate both the text and the graph structure information using large language models and graph neural networks (GNNs). However, the problem becomes very challenging when graphs are large, because training large language models and GNNs together has high computational complexity. In this paper, we propose an efficient and effective solution to learning on large text-attributed graphs by fusing graph structure and language learning with a variational Expectation-Maximization (EM) framework, called GLEM. Instead of simultaneously training large language models and GNNs on big graphs, GLEM alternately updates the two modules in the E-step and M-step. This procedure allows the two modules to be trained separately while still letting them interact and mutually enhance each other. Extensive experiments on multiple data sets demonstrate the efficiency and effectiveness of the proposed approach.
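
The alternating scheme described above can be illustrated with a toy sketch. This is not the authors' implementation: two softmax-regression models stand in for the LM and the GNN, the "GNN" is just two-hop feature propagation over raw text features, and the loop only mimics the exchange of pseudo-labels rather than optimizing the variational objective. The names `fit_softmax`, `mix`, `alternate`, `lm_pl_weight`, and `gnn_pl_weight`, the warm-up step, and the synthetic graph are all assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax(X, targets, weights, epochs=300, lr=0.5):
    """Weighted softmax regression; a stand-in for training either module."""
    W = np.zeros((X.shape[1], targets.shape[1]))
    for _ in range(epochs):
        probs = softmax(X @ W)
        W -= lr * X.T @ (weights[:, None] * (probs - targets)) / weights.sum()
    return W

def mix(y_onehot, labeled, pseudo_probs, pl_weight):
    """Gold labels on labeled nodes, hard pseudo-labels on all other nodes."""
    pseudo = np.eye(y_onehot.shape[1])[pseudo_probs.argmax(axis=1)]
    targets = np.where(labeled[:, None], y_onehot, pseudo)
    weights = np.where(labeled, 1.0 - pl_weight, pl_weight)
    return targets, weights

def alternate(X_text, A_hat, y, labeled, n_classes, rounds=3,
              lm_pl_weight=0.5, gnn_pl_weight=0.5):
    y_onehot = np.eye(n_classes)[y]
    X_graph = A_hat @ A_hat @ X_text  # two-hop propagation (toy GNN stand-in)
    # Warm-up (an assumption of this sketch): graph model on gold labels only.
    W_gnn = fit_softmax(X_graph, y_onehot, labeled.astype(float))
    for _ in range(rounds):
        # E-step: text model fits gold labels plus graph-model pseudo-labels.
        t, w = mix(y_onehot, labeled, softmax(X_graph @ W_gnn), lm_pl_weight)
        W_lm = fit_softmax(X_text, t, w)
        # M-step: graph model fits gold labels plus text-model pseudo-labels.
        t, w = mix(y_onehot, labeled, softmax(X_text @ W_lm), gnn_pl_weight)
        W_gnn = fit_softmax(X_graph, t, w)
    return softmax(X_text @ W_lm), softmax(X_graph @ W_gnn)

# Synthetic two-class graph: 40 nodes, noisy 2-d "text" features.
rng = np.random.default_rng(0)
n, half = 40, 20
y = np.repeat([0, 1], half)
X_text = np.column_stack([np.where(y == 0, 2.0, -2.0), np.zeros(n)])
X_text = X_text + 0.5 * rng.normal(size=(n, 2))

A = np.eye(n)  # self-loops
for i in range(n):
    j = (i // half) * half + (i + 1) % half  # ring inside each class block
    A[i, j] = A[j, i] = 1.0
A[0, half] = A[half, 0] = 1.0  # a single cross-class edge
A_hat = A / A.sum(axis=1, keepdims=True)  # row-normalized adjacency

labeled = np.zeros(n, dtype=bool)
labeled[[0, 1, 2, 3, 20, 21, 22, 23]] = True  # 8 gold labels in total

lm_probs, gnn_probs = alternate(X_text, A_hat, y, labeled, n_classes=2)
print("text-model accuracy:", (lm_probs.argmax(axis=1) == y).mean())
print("graph-model accuracy:", (gnn_probs.argmax(axis=1) == y).mean())
```

Each model is only ever trained on its own inputs, so the two never need to be held in one computation graph; the pseudo-labels are the only channel between them. In this sketch that is what keeps the alternating loop cheap compared with joint training.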

Paper Structure (20 sections, 9 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: The proposed GLEM framework trains the GNN and the LM separately within a variational EM framework. In the E-step, the LM is trained to predict both the gold labels and the GNN-predicted pseudo-labels; in the M-step, the GNN is trained to predict both the gold labels and the LM-inferred pseudo-labels, using the embeddings and pseudo-labels produced by the LM.
  • Figure 2: The convergence curves of GLEM on OGB datasets.
  • Figure 3: The effect of $\alpha$ and $\beta$ on GLEM-GCN.
  • Figure 4: The effect of $\alpha$ and $\beta$ on GLEM-RevGAT.