Active Learning for Graph Neural Networks via Node Feature Propagation
Yuexin Wu, Yichong Xu, Aarti Singh, Yiming Yang, Artur Dubrawski
TL;DR
This paper tackles label-efficient node classification on graphs by introducing FeatProp, an active learning strategy that propagates node features through the graph and then selects the nodes to label by K-Medoids clustering of the propagated features. It provides a theoretical bound linking the expected loss to the geometry of the propagated features and reports consistent empirical gains over representative baselines on four benchmark datasets. Because selection relies on propagated features rather than final-layer embeddings, the approach is robust to under-trained representations, which is a practical benefit when the labeling budget is limited.
Abstract
Graph Neural Networks (GNNs) for prediction tasks like node classification or edge prediction have received increasing attention in recent machine learning on graph-structured data. However, a large quantity of labeled graphs is difficult to obtain, which significantly limits the true success of GNNs. Although active learning has been widely studied for addressing label-sparse issues with other data types like text and images, how to make it effective over graphs remains an open research question. In this paper, we investigate active learning with GNNs for node classification tasks. Specifically, we propose a new method that uses node feature propagation followed by K-Medoids clustering of the nodes for instance selection in active learning. We justify the design choice of our approach with a theoretical bound analysis. In our experiments on four benchmark datasets, the proposed method outperforms other representative baseline methods consistently and significantly.
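The selection step described above (propagate node features, then cluster with K-Medoids and label the medoids) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not specify the propagation operator, the distance, or the K-Medoids variant, so the sketch assumes a GCN-style symmetrically normalized adjacency with self-loops, Euclidean distance, and a simple alternating K-Medoids with random initialization. The function names `propagate_features`, `k_medoids`, and `select_nodes` are hypothetical.

```python
import numpy as np


def propagate_features(adj, features, num_hops=2):
    """Return S^num_hops @ features for an assumed GCN-style operator S."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    a_hat = adj + np.eye(n)  # add self-loops so every degree is >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    # S = D^{-1/2} (A + I) D^{-1/2}  (assumption, not stated in the abstract)
    s = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    h = np.asarray(features, dtype=float)
    for _ in range(num_hops):
        h = s @ h
    return h


def k_medoids(points, k, num_iters=50, seed=0):
    """Simple alternating K-Medoids; returns sorted indices of the medoids."""
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    # Pairwise Euclidean distances (dense, so only suitable for small graphs).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    rng = np.random.default_rng(seed)
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(num_iters):
        # Assign every point to its nearest medoid.
        assign = dist[:, medoids].argmin(axis=1)
        # Keep each medoid in its own cluster, even when points coincide,
        # so no cluster is empty and medoids stay distinct.
        assign[medoids] = np.arange(k)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(assign == c)
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return np.sort(medoids)


def select_nodes(adj, features, budget, num_hops=2, seed=0):
    """Pick `budget` nodes to label: medoids of the propagated features."""
    propagated = propagate_features(adj, features, num_hops)
    return k_medoids(propagated, budget, seed=seed)
```

Calling `select_nodes(adj, features, budget)` with a dense adjacency matrix, a feature matrix, and a labeling budget returns the indices of the nodes to send for labeling. Note that no trained model is involved in this step, which is consistent with the method not depending on learned embeddings.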
