Graph Neural Network for Crawling Target Nodes in Social Networks
Kirill Lukyanov, Mikhail Drobyshevskiy, Danil Shaikhelislamov, Denis Turdakov
TL;DR
This work addresses efficient discovery of target nodes in unknown social graphs under a query budget by leveraging Graph Neural Networks to score candidate nodes based on local neighborhoods. A novel sample boosting technique augments training data during crawling, improving predictor quality in early stages. Empirical results show GNN-based crawlers often outperform classical predictors and exhibit reduced variance, with SAGE and GAT delivering strong performance across diverse target-topology datasets. The approach demonstrates practical potential for scalable targeted crawling in distributed or heterogeneous social networks, and points to future work in richer GNN architectures and online predictor switching.
Abstract
Crawling social networks has been an active research topic in recent years. One challenging task is to collect target nodes in an initially unknown graph given a budget of crawling steps. Predicting a node's property from its partially known neighbourhood is at the heart of a successful crawler. In this paper we adopt graph neural networks for this purpose and show that they are competitive with traditional classifiers and outperform them in individual cases. Additionally, we suggest a training sample boosting technique, which helps to diversify the training set at early stages of crawling and thus improves predictor quality. An experimental study on three types of target set topology indicates that the GNN-based approach has potential in the crawling task, especially when the target nodes are distributed.
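To make the task setting concrete, the following is a minimal sketch of a budgeted crawler that repeatedly queries the highest-scoring observed node. It is an illustration under stated assumptions, not the authors' implementation: the names `crawl` and `neighbour_target_fraction` are hypothetical, a node's target status is assumed to be revealed only when it is crawled, and the scoring function is a simple neighbourhood heuristic standing in for the GNN predictor. The sample boosting technique is not shown.

```python
from typing import Callable, Dict, List, Set, Tuple

Graph = Dict[int, Set[int]]  # full graph, hidden from the crawler until queried


def neighbour_target_fraction(node: int, crawled: Set[int],
                              known_targets: Set[int],
                              observed: Graph) -> float:
    """Stand-in predictor (assumption): share of the node's already
    crawled neighbours that turned out to be targets."""
    crawled_nbrs = observed.get(node, set()) & crawled
    if not crawled_nbrs:
        return 0.0
    return len(crawled_nbrs & known_targets) / len(crawled_nbrs)


def crawl(graph: Graph, is_target: Callable[[int], bool], seed: int,
          budget: int,
          score=neighbour_target_fraction) -> Tuple[List[int], Set[int]]:
    """Spend at most `budget` queries, each on the best-scored frontier node."""
    crawled: Set[int] = set()
    known_targets: Set[int] = set()
    observed: Graph = {}          # edges seen so far (partial neighbourhoods)
    frontier: Set[int] = {seed}   # observed but not yet crawled
    order: List[int] = []

    for _ in range(budget):
        if not frontier:
            break
        # Highest score wins; ties go to the smallest node id for determinism.
        node = max(sorted(frontier),
                   key=lambda n: score(n, crawled, known_targets, observed))
        frontier.remove(node)
        crawled.add(node)
        order.append(node)
        if is_target(node):       # status is revealed by the query (assumption)
            known_targets.add(node)
        for nbr in graph[node]:   # the query also reveals the node's neighbours
            observed.setdefault(node, set()).add(nbr)
            observed.setdefault(nbr, set()).add(node)
            if nbr not in crawled:
                frontier.add(nbr)
    return order, known_targets
```

A learned predictor could be swapped in through the `score` argument, as long as it only uses the crawled set, the known targets, and the observed edges.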
