Estimation of Graph Features Based on Random Walks Using Neighbors' Properties

Tsuyoshi Hasegawa; Shiori Hironaka; Kazuyuki Shudo

TL;DR

The paper addresses the estimation of features of large, unknown directed social networks when API query budgets limit how many neighbor lists can be acquired. It introduces a probabilistic adjacent-node sampling scheme within a random-walk framework, parameterized by $\alpha$, and formalizes the process as a Markov chain on an expanded state space. By reweighting sampled data with $w(e_{ij})=\frac{1}{d_{\mathrm{sum}}(v_j)}$ and applying $g(e_{ij})=f(v_j)$, the method yields unbiased feature estimates that converge to the expectation under the uniform node distribution as sampling progresses. Experiments on real and synthetic graphs show that the proposed approach outperforms established methods, with accuracy improving as $\alpha\to 1$ and as the query budget increases, demonstrating practical gains in cost-constrained online social network (OSN) feature estimation. The work provides a principled, scalable way to exploit adjacent-node information to reduce API costs while maintaining estimation quality in directed networks.
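To make the reweighting step concrete, here is a minimal Python sketch of a weighted estimator built from the two quantities named above, $w(e_{ij})$ and $g(e_{ij})$. It rests on assumptions not stated in this summary: that the estimate is the self-normalized ratio $\sum w\,g / \sum w$ (the usual form for re-weighted random walks), and that $d_{\mathrm{sum}}(v_j)$ is a positive degree count for the end node of each sampled edge. The function name `reweighted_estimate` and the `(f_value, d_sum)` input layout are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Tuple


def reweighted_estimate(samples: Iterable[Tuple[float, int]]) -> float:
    """Self-normalized weighted mean of f over sampled edges.

    Each sample is (f(v_j), d_sum(v_j)) for the end node v_j of a sampled
    edge e_ij. The ratio form sum(w * g) / sum(w) is an assumption of this
    sketch, not a formula quoted from the paper.
    """
    numerator = 0.0
    denominator = 0.0
    for f_value, d_sum in samples:
        if d_sum <= 0:
            raise ValueError("d_sum must be positive for a sampled node")
        weight = 1.0 / d_sum            # w(e_ij) = 1 / d_sum(v_j)
        numerator += weight * f_value   # w(e_ij) * g(e_ij), with g(e_ij) = f(v_j)
        denominator += weight
    if denominator == 0.0:
        raise ValueError("at least one sample is required")
    return numerator / denominator


# Toy check: node a has degree 1 and f = 10; node b has degree 3 and f = 2.
# A degree-proportional sampler would visit a once and b three times.
toy_samples = [(10.0, 1), (2.0, 3), (2.0, 3), (2.0, 3)]
toy_estimate = reweighted_estimate(toy_samples)
```

In the toy data, an unweighted average of the samples would give 4.0, whereas `toy_estimate` is approximately 6.0, the plain mean of $f$ over the two nodes. This is the sense in which dividing by the degree cancels the degree bias of the walk.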

Abstract

Using random walks for sampling has proven advantageous in assessing the characteristics of large and unknown social networks. Several algorithms based on random walks have been introduced in recent years. In the practical application of social network sampling, there is a recurrent reliance on an application programming interface (API) for obtaining adjacent nodes. However, owing to constraints related to query frequency and associated API expenses, it is preferable to minimize API calls during the feature estimation process. In this study, considering the acquisition of neighboring nodes as a cost factor, we introduce a feature estimation algorithm that outperforms existing algorithms in terms of accuracy. Through experiments that simulate sampling on known graphs, we demonstrate the superior accuracy of our proposed algorithm when compared to existing alternatives.

Paper Structure (19 sections, 10 theorems, 19 equations, 8 figures, 1 table, 1 algorithm)

Introduction
Preliminaries
Definitions and Notations
Model
Markov Chain Basics
Random Walk Sampling
Proposed Method
Probabilistic Addition of Adjacent Nodes to the Sample Sequence
Markov Chains in the Proposed Sampling Algorithm
Feature Estimation
Experiment
Experimental Setup
Relationship between $\alpha$ and Estimation Accuracy
Comparison with Existing Methods
Discussion
...and 4 more sections

Key Result

THEOREM 1

In the context of a distribution $\boldsymbol{\pi}=(\pi_i)_{i\in S}$, if the condition $\pi_j = \sum_{i\in S}\pi_i P_{i,j}$ is satisfied, it indicates that the distribution $\boldsymbol{\pi}$ serves as the steady-state distribution for a Markov chain governed by the probability transition matrix $\m
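The balance condition $\pi_j = \sum_{i\in S}\pi_i P_{i,j}$ visible in the statement above can be checked numerically. The sketch below uses a hypothetical four-node undirected graph and a simple random walk with $P_{i,j}=1/d_i$ for each neighbor $j$ of $i$. Both are assumptions made for illustration and are not the paper's chain on the expanded state space. The helper name `is_stationary` is likewise hypothetical.

```python
def is_stationary(pi, P, tol=1e-12):
    """Return True if pi_j == sum_i pi_i * P[i][j] holds for every state j."""
    n = len(pi)
    for j in range(n):
        flow_in = sum(pi[i] * P[i][j] for i in range(n))
        if abs(flow_in - pi[j]) > tol:
            return False
    return True


# Hypothetical undirected graph given as a symmetric adjacency matrix.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
degrees = [sum(row) for row in adjacency]

# Simple random walk: move to a uniformly chosen neighbor.
P = [[a / d for a in row] for row, d in zip(adjacency, degrees)]

# Candidate distributions: degree-proportional and uniform.
total_degree = sum(degrees)
pi_degree = [d / total_degree for d in degrees]
pi_uniform = [1.0 / len(degrees)] * len(degrees)

degree_ok = is_stationary(pi_degree, P)     # balance condition holds
uniform_ok = is_stationary(pi_uniform, P)   # balance condition fails
```

For this walk the degree-proportional distribution satisfies the condition while the uniform one does not, which is why samples from such a walk are biased toward high-degree nodes and need reweighting before they are averaged.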

Figures (8)

Figure 1: Overview of the proposed method. Gray nodes denote nodes whose degree information and properties can be acquired.
Figure 2: Average NRMSE for each feature categorized by query rate at each $\alpha$.
Figure 3: NRMSE for out-degree estimation.
Figure 4: NRMSE for random label estimation.
Figure 5: NRMSE for high degree label estimation.
...and 3 more figures

Theorems & Definitions (26)

THEOREM 1
THEOREM 2
THEOREM 3
DEFINITION 4
THEOREM 5
proof
DEFINITION 6
DEFINITION 7
DEFINITION 8
DEFINITION 9
...and 16 more
