Graph Neural Networks for Text Classification: A Survey
Kunze Wang, Yihao Ding, Soyeon Caren Han
TL;DR
This survey analyzes how graph neural networks have been applied to text classification by separating corpus-level and document-level graph approaches. It details graph construction, node representations, and edge features across a broad set of models up to 2023, highlighting TextGCN-like corpus graphs with PMI/TF-IDF edges and various inductive and multi-graph extensions, as well as document-level methods using local consecutive word graphs and global co-occurrence graphs. The work compares performance across standard datasets (topic and sentiment tasks), discusses datasets, metrics, and experimental design, and provides critical insights into scalability, inductive deployment, and the balance between external resources and GNN benefits. It also outlines key challenges and future directions, including integration with pretrained models, dynamic/heterogeneous graphs, and scalable inductive inference. Overall, GNN-based text classification can capture both global corpus structure and local document-level relations, offering complementary strengths to traditional sequential and transformer-based approaches.
Abstract
Text classification is one of the most essential and fundamental problems in Natural Language Processing. While many recent text classification models apply sequential deep learning techniques, graph neural network-based models can directly handle complex structured text data and exploit global information. Many real-world text classification applications can be naturally cast as a graph that captures words, documents, and corpus-level global features. In this survey, we cover methods up to 2023, including corpus-level and document-level graph neural networks. We discuss each of these methods in detail, covering both the graph construction mechanisms and the graph-based learning process. In addition to the technical survey, we examine open issues and future directions for text classification using graph neural networks. We also cover datasets, evaluation metrics, and experiment design, and present a summary of published performance on publicly available benchmarks. Finally, we present a comprehensive comparison between different techniques and identify the pros and cons of various evaluation metrics.
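As a rough illustration of the corpus-level graph construction mentioned above (TextGCN-like graphs with PMI and TF-IDF edges), the following minimal Python sketch builds document-word edges weighted by TF-IDF and word-word edges weighted by positive PMI over sliding windows. The function name `build_corpus_graph`, the whitespace tokenizer, the `window_size` default, and the plain `log(N / df)` IDF variant are assumptions made for this example, not details taken from any surveyed model; self-loops and feature matrices are omitted.

```python
import math
from collections import Counter
from itertools import combinations


def build_corpus_graph(docs, window_size=3):
    """Return {(node_a, node_b): weight} for a simplified TextGCN-style graph.

    Hypothetical helper for illustration only.
    """
    tokenized = [d.lower().split() for d in docs]
    edges = {}

    # Document-word edges: TF-IDF weight (assumed variant: tf * log(N / df)).
    n_docs = len(tokenized)
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    for i, toks in enumerate(tokenized):
        for w, count in Counter(toks).items():
            weight = (count / len(toks)) * math.log(n_docs / doc_freq[w])
            if weight > 0:
                edges[(f"doc{i}", w)] = weight

    # Collect sliding windows over every document.
    windows = []
    for toks in tokenized:
        if len(toks) <= window_size:
            windows.append(toks)
        else:
            windows.extend(
                toks[j:j + window_size]
                for j in range(len(toks) - window_size + 1)
            )

    # Word-word edges: keep only pairs with positive PMI.
    n_win = len(windows)
    word_count = Counter()
    pair_count = Counter()
    for win in windows:
        uniq = sorted(set(win))
        word_count.update(uniq)
        pair_count.update(combinations(uniq, 2))
    for (a, b), c_ab in pair_count.items():
        pmi = math.log(c_ab * n_win / (word_count[a] * word_count[b]))
        if pmi > 0:
            edges[(a, b)] = pmi
    return edges


docs = ["graph neural networks", "graph convolution networks", "movie was great"]
edges = build_corpus_graph(docs)
```

Word pairs that never share a window, or whose PMI is not positive, receive no edge, which keeps the graph sparse. Words that occur in every document get a zero TF-IDF weight under this IDF variant and are likewise dropped.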
