Network Sampling: An Overview and Comparative Analysis
Quoc Chuong Nguyen
TL;DR
The paper investigates how well different network sampling methods preserve structural properties in static versus temporal networks. By comparing node-based, edge-based, and exploration-based approaches on a static CA-HepTh collaboration network and a temporal CollegeMsg network, it shows that no single method consistently preserves metrics across contexts. Advanced strategies perform best on static graphs, while simpler methods can outperform them in temporal settings, underscoring the need for context-aware, metric-driven sampling choices. The findings offer practical guidance for researchers selecting sampling methods tailored to network type and analytical goals, and point to future work on adaptive sampling for evolving systems, broader datasets, and metric-specific strategies.
Abstract
Network sampling is a crucial technique for analyzing large or partially observable networks. However, the effectiveness of different sampling methods can vary significantly depending on the context. In this study, we empirically compare representative methods from three main categories: node-based, edge-based, and exploration-based sampling. We use two real-world datasets for our analysis: a scientific collaboration network and a temporal message-sending network. Our results indicate that no single sampling method consistently outperforms the others across both datasets. Although advanced methods tend to provide better accuracy on static networks, they often perform poorly on temporal networks, where simpler techniques can be more effective. These findings suggest that the best sampling strategy depends not only on the structural characteristics of the network but also on the specific metrics that need to be preserved or analyzed. Our work offers practical insights for researchers in choosing sampling approaches that are tailored to different types of networks and analytical objectives.
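To make the three categories concrete, the following is a minimal sketch of one simple representative from each family, using only the Python standard library. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names, the toy edge list, and the choice of a plain random walk as the exploration-based example are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy graph (undirected edge list); not one of the paper's datasets.
TOY_EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5)]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def random_node_sample(edges, k, rng):
    """Node-based: choose k nodes uniformly, keep edges with both ends chosen."""
    nodes = sorted({n for edge in edges for n in edge})
    chosen = set(rng.sample(nodes, k))
    induced = [(u, v) for u, v in edges if u in chosen and v in chosen]
    return chosen, induced


def random_edge_sample(edges, m, rng):
    """Edge-based: choose m edges uniformly, keep their endpoints."""
    picked = rng.sample(list(edges), m)
    return {n for edge in picked for n in edge}, picked


def random_walk_sample(edges, k, rng, max_steps=10_000):
    """Exploration-based: walk to random neighbours until k distinct nodes
    are visited or max_steps is reached (assumed cap for disconnected graphs)."""
    adj = build_adjacency(edges)
    current = rng.choice(sorted(adj))
    visited = {current}
    for _ in range(max_steps):
        if len(visited) >= k:
            break
        current = rng.choice(sorted(adj[current]))
        visited.add(current)
    induced = [(u, v) for u, v in edges if u in visited and v in visited]
    return visited, induced


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    for sampler, size in [(random_node_sample, 4),
                          (random_edge_sample, 3),
                          (random_walk_sample, 4)]:
        nodes, kept = sampler(TOY_EDGES, size, rng)
        print(sampler.__name__, sorted(nodes), kept)
```

Even on this toy graph the three samplers tend to keep different parts of the structure: uniform node sampling can return few or no edges, edge sampling favours high-degree nodes, and the walk returns a connected region around its starting point. Differences of this kind are why the choice of sampler interacts with the metric being preserved.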
