EX-Graph: A Pioneering Dataset Bridging Ethereum and X

Qian Wang, Zhen Zhang, Zemin Liu, Shengliang Lu, Bingqiao Luo, Bingsheng He

TL;DR

EX-Graph presents a pioneering, open-source dataset that bridges Ethereum on-chain transactions with X off-chain social networks through verified matching links, enabling richer analysis of blockchain activity with social context. By integrating structured Ethereum features with semantically enriched X account data (via BERT and PCA) and DeepWalk embeddings, the authors demonstrate that off-chain information significantly improves Ethereum link prediction, wash-trading detection, and matching link prediction. Empirical results show up to an 8% gain in Ethereum link prediction AUC-ROC, up to an 18% recall gain in wash-trading detection, and strong performance in predicting Ethereum-X matches (best AUC-ROC ≈ 0.74). The dataset advances graph learning and cross-domain fraud analysis, providing a foundation for future research in de-anonymizing on-chain activity through verifiable off-chain relationships.

Abstract

While numerous public blockchain datasets are available, their utility is constrained by an exclusive focus on blockchain data. This constraint limits the incorporation of relevant social network data into blockchain analysis, thereby diminishing the breadth and depth of insight that can be derived. To address this limitation, we introduce EX-Graph, a novel dataset that authentically links Ethereum and X, marking the first and largest dataset of its kind. EX-Graph combines Ethereum transaction records (2 million nodes and 30 million edges) and X following data (1 million nodes and 3 million edges), bonding 30,667 Ethereum addresses with verified X accounts sourced from OpenSea. Detailed statistical analysis on EX-Graph highlights the structural differences between X-matched and non-X-matched Ethereum addresses. Extensive experiments, including Ethereum link prediction, detection of wash-trading Ethereum addresses, and X-Ethereum matching link prediction, emphasize the significant role of X data in enhancing Ethereum analysis. EX-Graph is available at https://exgraph.deno.dev/.

Paper Structure (53 sections, 3 equations, 3 figures, 12 tables)

Figures (3)

  • Figure 1: An overview of EX-Graph. The left figure displays the Ethereum transaction graph, showcasing transaction relationships between Ethereum addresses. The right X graph portrays the follow relationships among X accounts. A matching link identifies that the corresponding Ethereum address and X account belong to the same entity.
  • Figure 2: Graph construction from our dataset. This process initiates with the collection of on-chain, matching, and off-chain data, from which we construct on-chain and off-chain graphs. Subsequently, we extract features for Ethereum addresses and X accounts.
  • Figure 3: Four typical wash trading patterns. Nodes A, B, C and D represent Ethereum addresses involved in a cycle of transfer activities.
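
The Figure 3 caption describes wash trading as addresses involved in a cycle of transfers. As a minimal sketch of that idea only, and not the authors' detection method, the following Python searches a directed transfer graph for short cycles. The function name `find_transfer_cycles`, the `max_len` bound of four addresses, and the toy transfer list are all assumptions made for illustration; the four specific patterns in the figure are not reproduced here.

```python
from collections import defaultdict


def find_transfer_cycles(transfers, max_len=4):
    """Return simple directed cycles of 2..max_len addresses.

    transfers: iterable of (sender, receiver) pairs (hypothetical input format).
    Each cycle is reported once, as a tuple that starts at its smallest address.
    """
    graph = defaultdict(set)
    for sender, receiver in transfers:
        if sender != receiver:  # self-transfers are ignored in this sketch
            graph[sender].add(receiver)

    cycles = set()

    def dfs(start, node, path):
        for nxt in graph.get(node, ()):
            if nxt == start and len(path) >= 2:
                # Returned to the starting address: a cycle is closed.
                cycles.add(tuple(path))
            elif nxt > start and nxt not in path and len(path) < max_len:
                # Only visit addresses larger than the start, so every cycle
                # is discovered from its smallest member exactly once.
                dfs(start, nxt, path + [nxt])

    for start in list(graph):
        dfs(start, start, [start])
    return sorted(cycles)


# Toy example (made-up addresses A..E, not taken from EX-Graph).
toy_transfers = [
    ("A", "B"), ("B", "A"),  # two-address round trip
    ("B", "C"), ("C", "A"),  # closes the three-address cycle A -> B -> C -> A
    ("C", "D"), ("D", "E"),  # a chain that never returns
]
toy_cycles = find_transfer_cycles(toy_transfers)
print(toy_cycles)
```

A cycle found this way is only a structural signal. Real transfers would also carry tokens, amounts, and timestamps, which this sketch does not model.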