Estimation of Graph Features Based on Random Walks Using Neighbors' Properties
Tsuyoshi Hasegawa, Shiori Hironaka, Kazuyuki Shudo
TL;DR
The paper tackles the estimation of features of large, unknown directed social networks when API query budgets limit how many neighbor lists can be fetched. It introduces a probabilistic adjacent-node sampling scheme within a random-walk framework, parameterized by $\alpha$, and formalizes the process as a Markov chain on an expanded state space. By reweighting sampled data with $w(e_{ij})=\frac{1}{d_{\mathrm{sum}}(v_j)}$ and applying $g(e_{ij})=f(v_j)$, the method obtains unbiased feature estimates that converge to the expectation of $f$ under the uniform node distribution as sampling progresses. Empirical results on real and synthetic graphs show that the proposed approach outperforms established methods, with accuracy improving as $\alpha\to 1$ and as the query budget grows, demonstrating practical gains in cost-constrained online social network (OSN) feature estimation. The work provides a principled, scalable way to exploit adjacent-node information to reduce API costs while maintaining estimation quality in directed networks.
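The reweighting step can be illustrated with a minimal sketch. It assumes the standard weighted ratio form $\sum w\,g / \sum w$ that is common for random-walk estimators; the summary above gives only $w$ and $g$, so the exact estimator used in the paper may differ. The function name `reweighted_estimate` and the input layout (one `(d_sum, f_value)` pair per sampled edge) are hypothetical and chosen for illustration only.

```python
from typing import Iterable, Tuple


def reweighted_estimate(samples: Iterable[Tuple[float, float]]) -> float:
    """Weighted ratio estimate of the mean of f over nodes.

    Each sample describes one sampled edge e_ij as a pair
    (d_sum(v_j), f(v_j)). The weight is w(e_ij) = 1 / d_sum(v_j)
    and the sampled value is g(e_ij) = f(v_j).
    Assumption: the estimate is sum(w * g) / sum(w).
    """
    numerator = 0.0
    denominator = 0.0
    for d_sum, f_value in samples:
        if d_sum <= 0:
            raise ValueError("d_sum must be positive for a sampled node")
        weight = 1.0 / d_sum          # w(e_ij)
        numerator += weight * f_value  # w(e_ij) * g(e_ij)
        denominator += weight
    if denominator == 0.0:
        raise ValueError("at least one sample is required")
    return numerator / denominator


# Toy input: a node with d_sum = 2 is visited twice as often as a node
# with d_sum = 1, which is what a degree-proportional walk would produce.
toy_samples = [(1, 10.0), (2, 20.0), (2, 20.0)]
print(reweighted_estimate(toy_samples))  # 15.0, the plain mean of 10 and 20
```

In the toy input the unweighted sample mean would be about 16.7, pulled toward the high-degree node; dividing by $d_{\mathrm{sum}}$ cancels that over-representation.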
Abstract
Using random walks for sampling has proven advantageous in assessing the characteristics of large and unknown social networks, and several algorithms based on random walks have been introduced in recent years. In practical social network sampling, adjacent nodes are typically obtained through an application programming interface (API). However, owing to constraints on query frequency and the expense of API use, it is preferable to minimize API calls during feature estimation. In this study, treating the acquisition of neighboring nodes as the cost, we introduce a feature estimation algorithm that is more accurate than existing algorithms. We demonstrate this through experiments that simulate sampling on known graphs.
