Revisiting Semi-Supervised Learning with Graph Embeddings
Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov
TL;DR
The paper addresses semi-supervised learning on graph-structured data by learning node embeddings that jointly predict class labels and graph context. It introduces Planetoid, a neural-network framework with transductive and inductive variants that optimizes a combined supervised (label-prediction) and unsupervised (context-prediction) objective. Experiments on text classification, distantly supervised entity extraction, and entity classification show improvements over many prior graph-based and embedding approaches, with notable gains in the inductive setting when features are informative. The framework offers a scalable way to combine embedding learning with task-specific supervision in graph-structured domains.
Abstract
We present a semi-supervised learning framework based on graph embeddings. Given a graph between instances, we train an embedding for each instance to jointly predict the class label and the neighborhood context in the graph. We develop both transductive and inductive variants of our method. In the transductive variant, the class labels are determined by both the learned embeddings and the input feature vectors, while in the inductive variant, the embeddings are defined as a parametric function of the feature vectors, so predictions can be made on instances not seen during training. On a large and diverse set of benchmark tasks, including text classification, distantly supervised entity extraction, and entity classification, we show improved performance over many of the existing models.
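The joint objective described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the single linear softmax layer, the ReLU feature map, the negative-sampling form of the context loss, and the weight `lam` are all assumptions chosen for illustration. It only evaluates the loss on toy data, with no training loop.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def supervised_loss(features, embeddings, W, labels):
    # Label prediction from input features and embeddings together
    # (assumption: one linear layer on their concatenation).
    inputs = np.concatenate([features, embeddings], axis=1)
    probs = softmax(inputs @ W)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def context_loss(embeddings, context_vecs, pairs, signs):
    # pairs[:, 0] is a node index, pairs[:, 1] is a context index.
    # signs is +1 for an observed (node, context) pair, -1 for a negative sample.
    scores = np.sum(embeddings[pairs[:, 0]] * context_vecs[pairs[:, 1]], axis=1)
    return np.mean(np.logaddexp(0.0, -signs * scores))  # -log sigmoid(sign * score)

def transductive_embeddings(table, node_ids):
    # Transductive: one free embedding vector per training node (lookup table).
    return table[node_ids]

def inductive_embeddings(features, V):
    # Inductive: embeddings are a parametric function of the features,
    # so they can be computed for nodes not seen during training.
    return np.maximum(features @ V, 0.0)

# Toy data (hypothetical): 6 nodes, 4 features, 3-dim embeddings, 2 classes.
rng = np.random.default_rng(0)
n, d, k, c = 6, 4, 3, 2
X = rng.normal(size=(n, d))
labeled = np.arange(3)            # only the first three nodes carry labels
y = np.array([0, 1, 0])
pairs = np.array([[0, 1], [1, 2], [3, 4], [0, 5]])
signs = np.array([1.0, 1.0, 1.0, -1.0])
table = rng.normal(scale=0.1, size=(n, k))
V = rng.normal(scale=0.1, size=(d, k))
C = rng.normal(scale=0.1, size=(n, k))
W = rng.normal(scale=0.1, size=(d + k, c))

def joint_loss(E, lam=1.0):
    # Supervised term on labeled nodes plus weighted context term on all nodes.
    return (supervised_loss(X[labeled], E[labeled], W, y)
            + lam * context_loss(E, C, pairs, signs))

loss_trans = joint_loss(transductive_embeddings(table, np.arange(n)))
loss_ind = joint_loss(inductive_embeddings(X, V))
```

In this sketch the two variants differ only in where the embeddings come from: `transductive_embeddings` needs a node's index in the training graph, whereas `inductive_embeddings` needs only its feature vector, which is why the inductive variant can be applied to unseen instances.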
