You do not have to train Graph Neural Networks at all on text-attributed graphs
Kaiwen Dong, Zhichun Guo, Nitesh V. Chawla
TL;DR
This work targets semi-supervised node classification on text-attributed graphs (TAGs) and removes the need for gradient-based GNN training. It introduces TrainlessGNN, a trainless, linear method that constructs a class-specific weight matrix through a single round of message passing with virtual label nodes; in over-parameterized settings, this construction approximates the solution of a minimum-norm linear regression. Empirical results across nine TAG benchmarks show that TrainlessGNN can match or surpass conventionally trained models, especially when the attribute dimension is large and labels from both the training and validation sets are used. It is also robust to heterophily and takes far less time to fit than gradient-based training. The approach offers a principled, scalable alternative for TAG classification by linking the linear subspace structure of text encodings to a closed-form weight construction.
Abstract
Graph-structured data, specifically text-attributed graphs (TAG), effectively represent relationships among varied entities. Such graphs are essential for semi-supervised node classification tasks. Graph Neural Networks (GNNs) have emerged as a powerful tool for handling graph-structured data. Although gradient descent is commonly used to train GNNs for node classification, this study explores alternative methods that eliminate iterative optimization. We introduce TrainlessGNN, a linear GNN model that capitalizes on the observation that text encodings from the same class often cluster together in a linear subspace. The model constructs a weight matrix that represents each class's node attribute subspace, offering an efficient approach to semi-supervised node classification on TAG. Extensive experiments show that our trainless models can match or even surpass their conventionally trained counterparts, demonstrating that gradient descent can be avoided in certain configurations.
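The weight construction described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that each virtual label node simply sums the attributes of the labeled nodes in its class, so the weight matrix is W = YᵀX over the labeled nodes, and that inference applies one step of symmetric-normalized propagation followed by a linear map. The function names, the toy graph, and the synthetic "text encodings" are all hypothetical, and any further details of the paper's construction are not reproduced here.

```python
import numpy as np

def trainless_weights(X, labels, labeled_idx, num_classes):
    """Build a (num_classes, d) weight matrix without gradient descent.

    Assumption: each virtual label node aggregates (sums) the attributes
    of the labeled nodes of its class in one message-passing round.
    """
    Y = np.zeros((len(labeled_idx), num_classes))
    Y[np.arange(len(labeled_idx)), labels[labeled_idx]] = 1.0
    return Y.T @ X[labeled_idx]

def normalized_adjacency(A):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def predict(A_hat, X, W):
    """Linear GNN inference: propagate attributes once, then score each class."""
    return (A_hat @ X @ W.T).argmax(axis=1)

# Toy data (hypothetical): 6 nodes, 2 classes, 32-dimensional attributes.
# Each class lies near its own direction, mimicking clustered text encodings.
rng = np.random.default_rng(0)
y_true = np.array([0, 0, 0, 1, 1, 1])
prototypes = np.eye(32)[:2]
X = prototypes[y_true] + 0.01 * rng.normal(size=(6, 32))

# Mostly homophilous path graph 0-1-2-3-4-5 with one cross-class edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    A[i, j] = A[j, i] = 1.0

labeled_idx = np.array([0, 3])  # one labeled node per class
W = trainless_weights(X, y_true, labeled_idx, num_classes=2)
pred = predict(normalized_adjacency(A), X, W)

# Minimum-norm least-squares solution of X_L W^T = Y, for comparison.
Y = np.eye(2)[y_true[labeled_idx]]
W_min_norm = (np.linalg.pinv(X[labeled_idx]) @ Y).T
```

In this toy setting the labeled attribute vectors are nearly orthonormal, so the summed weights `W` stay close to the minimum-norm solution `W_min_norm`. That is the over-parameterized regime (attribute dimension much larger than the number of labeled nodes) in which the closed-form construction is expected to behave like a fitted linear model.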
