Network classification through random walks

Gonzalo Travieso, Joao Merenda, Odemir M. Bruno

TL;DR

This work tackles network classification by extracting features from random-walk dynamics on graphs. It uses the traditional random walk (RW), the self-avoiding walk (SAW), and the limited-memory SAW to produce node-visit statistics and walk-length statistics. These random-walk features are compared against four alternative feature sets (structural measures, Life-Like Network Automata (LLNA), deterministic tourist walk (DTW), and Graph2Vec) within a linear discriminant analysis (LDA) framework evaluated by 10-fold cross-validation. Across twelve diverse datasets, SAW-based features consistently provide strong classification performance, often outperforming the baselines and showing robustness to noise; memory-based features offer diminishing returns and incur higher costs. The findings suggest that SAW-driven statistics offer a scalable and effective approach to discriminating network types, with potential further gains from extending the features (e.g., first-visit times) and from constraining walk length to address scalability. Overall, the proposed random-walk features deliver competitive accuracy and resilience across domains, highlighting the value of dynamic, structure-aware representations for network classification.

Abstract

Network models have been widely used to study diverse systems and analyze their dynamic behaviors. Given the structural variability of networks, an intriguing question arises: Can we infer the type of system represented by a network based on its structure? This classification problem involves extracting relevant features from the network. Existing literature has proposed various methods that combine structural measurements and dynamical processes for feature extraction. In this study, we introduce a novel approach to characterize networks using statistics from random walks, which can be particularly informative about network properties. We present the employed statistical metrics and compare their performance on multiple datasets with other state-of-the-art feature extraction methods. Our results demonstrate that the proposed method is effective in many cases, often outperforming existing approaches, although some limitations are observed across certain datasets.
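
To make the idea of walk-based features concrete, the following is a minimal sketch, not the authors' implementation. It runs self-avoiding or limited-memory walks on a graph stored as an adjacency dictionary and summarizes walk lengths and node-visit counts. The names `walk` and `walk_features`, the choice of mean and standard deviation as summary statistics, and the reading of the memory parameter as "avoid the current node and the `memory` nodes visited before it" are all assumptions made for illustration; the paper's exact definitions may differ.

```python
import random
import statistics
from collections import Counter


def walk(adj, start, memory=None, max_steps=1000, rng=random):
    """Run one walk on `adj` (dict: node -> list of neighbours).

    memory=None : self-avoiding walk, never revisits any node.
    memory=m    : ASSUMED limited-memory rule, avoids the current node
                  and the m nodes visited just before it (m=0 behaves
                  like a traditional random walk on a simple graph).
    The walk stops when no allowed neighbour exists or after max_steps.
    """
    path = [start]
    while len(path) - 1 < max_steps:
        if memory is None:
            forbidden = set(path)
        else:
            forbidden = set(path[-(memory + 1):])
        candidates = [v for v in adj[path[-1]] if v not in forbidden]
        if not candidates:
            break  # the walker is trapped
        path.append(rng.choice(candidates))
    return path


def walk_features(adj, walks_per_node=20, memory=None, max_steps=1000, seed=0):
    """Summarise walk lengths and node-visit counts over many walks.

    The four statistics returned here are illustrative choices only.
    """
    rng = random.Random(seed)
    lengths = []
    visits = Counter()
    for start in adj:
        for _ in range(walks_per_node):
            path = walk(adj, start, memory, max_steps, rng)
            lengths.append(len(path) - 1)  # number of steps taken
            visits.update(path)            # count every visited node
    counts = [visits[v] for v in adj]
    return {
        "length_mean": statistics.mean(lengths),
        "length_std": statistics.pstdev(lengths),
        "visit_mean": statistics.mean(counts),
        "visit_std": statistics.pstdev(counts),
    }


if __name__ == "__main__":
    # Toy example: a ring of 5 nodes.
    ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
    print(walk_features(ring, walks_per_node=3))
```

A feature vector of this kind, computed once per network, could then be passed to any standard classifier; the source uses LDA with 10-fold cross-validation for that step.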

Paper Structure

This paper contains 13 sections, 3 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Classification performance of the limited-memory random walk as a function of the memory parameter $m$ in the metabolic datasets. (a) Accuracy obtained using individual memory values ($m = 1$ to $10$) in isolation. (b) Accuracy obtained using cumulative combinations of memory values, incrementally adding from $m = 1$ to $m = 10$. (c) Average accuracy across all metabolic datasets as the memory increases, comparing individual memory usage and memory combinations.
  • Figure 2: Classification performance under increasing noise levels. The proposed Random Walk-based method (using the sas and sav feature sets) is compared against state-of-the-art methods.