EtherBee: A Global Dataset of Ethereum Node Performance Measurements Coupled with Honeypot Interactions and Full Network Sessions
Scott Seidenberger, Anindya Maiti
TL;DR
EtherBee presents a global, multimodal dataset that fuses Ethereum node metrics, network traffic metadata, and honeypot logs collected from ten vantage points over three months to enable holistic analysis of node performance, P2P topology, and security threats. Linking operational data with security telemetry supports new investigations into performance, reliability, and decentralization dynamics in the Ethereum network. A key finding is that latency-based peer pruning can unintentionally centralize connectivity along major geographic and undersea cable routes, with implications for resilience and censorship resistance. The dataset and methodology are aimed at researchers studying decentralized networks, with exports to tabular formats to facilitate reuse.
Abstract
We introduce EtherBee, a global dataset integrating detailed Ethereum node metrics, network traffic metadata, and honeypot interaction logs collected from ten geographically diverse vantage points over three months. By correlating node data with granular network sessions and security events, EtherBee provides unique insights into benign and malicious activity, node stability, and network-level threats in the Ethereum peer-to-peer network. A case study shows how client-based optimizations can unintentionally concentrate the network geographically, impacting resilience and censorship resistance. We publicly release EtherBee to promote further investigations into performance, reliability, and security in decentralized networks.
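To make the case-study intuition concrete, the following is a minimal sketch of how a latency-driven pruning policy could narrow a node's geographic peer mix. It is illustrative only: the peer list, region labels, latency values, and the helper names `prune_by_latency` and `region_share` are all hypothetical and are not drawn from the EtherBee data or from any Ethereum client implementation.

```python
from collections import Counter

# Hypothetical peers as seen from a single node:
# (peer_id, region, round-trip latency in ms). Values are invented for illustration.
PEERS = [
    ("p1", "eu-west", 12.0),
    ("p2", "eu-west", 18.0),
    ("p3", "us-east", 85.0),
    ("p4", "eu-central", 25.0),
    ("p5", "ap-south", 140.0),
    ("p6", "sa-east", 210.0),
    ("p7", "us-east", 90.0),
    ("p8", "eu-west", 15.0),
]


def prune_by_latency(peers, max_peers):
    """Keep the max_peers lowest-latency peers (an assumed, simplified policy)."""
    return sorted(peers, key=lambda p: p[2])[:max_peers]


def region_share(peers):
    """Return the fraction of peers located in each region."""
    counts = Counter(region for _, region, _ in peers)
    total = len(peers)
    return {region: n / total for region, n in counts.items()}


before = region_share(PEERS)
kept = prune_by_latency(PEERS, max_peers=4)
after = region_share(kept)

print("regions before pruning:", before)
print("regions after pruning: ", after)
```

In this toy input the peer set spans five regions before pruning and only two nearby regions afterward, which is the kind of concentration effect the case study describes. Real clients weigh more than latency when selecting peers, so this should be read as an intuition aid rather than a model of client behavior.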
