ORBITAAL: A Temporal Graph Dataset of Bitcoin Entity-Entity Transactions

Célestin Coquidé, Rémy Cazabet

TL;DR

This work presents ORBITAAL, the first comprehensive dataset based on temporal graph formalism. It covers all Bitcoin transactions from January 2009 to January 2021 and represents the entity-entity transaction network as both snapshots and a stream graph.

Abstract

Research on Bitcoin (BTC) transactions is of interest to both economics and network science. Although this cryptocurrency is based on a decentralized system that makes transaction details freely accessible, turning raw blockchain data into an analyzable form is not straightforward because of the specificities of the Bitcoin protocol and the richness of the data. To address the need for an accessible dataset, we present ORBITAAL, the first comprehensive dataset based on temporal graph formalism. The dataset covers all Bitcoin transactions from January 2009 to January 2021. ORBITAAL provides two temporal graph representations of the entity-entity transaction network: snapshots and a stream graph. Each transaction value is given in both bitcoin and US dollars, using a daily conversion rate. The dataset also provides details on entities, such as their global BTC balance and associated public addresses.
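
To make the relationship between the two representations concrete, here is a minimal sketch of how timestamped stream graph records could be aggregated into daily snapshots, with each value converted to US dollars at a daily rate. The record layout, the entity identifiers, the `daily_snapshots` helper, and the exchange rates are all illustrative assumptions, not ORBITAAL's actual schema or data.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical stream graph records:
# (unix_timestamp, source_entity, target_entity, value_btc).
# The layout and the values are illustrative, not ORBITAAL's real schema.
STREAM = [
    (1293840000, 1, 2, 0.5),  # 2011-01-01 00:00:00 UTC
    (1293843600, 1, 2, 1.5),  # 2011-01-01 01:00:00 UTC
    (1293926400, 2, 3, 2.0),  # 2011-01-02 00:00:00 UTC
]

# Hypothetical daily BTC-to-USD rates keyed by ISO date (made-up numbers).
USD_PER_BTC = {"2011-01-01": 0.30, "2011-01-02": 0.30}


def day_of(timestamp):
    """Return the UTC calendar day of a Unix timestamp as an ISO date string."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).date().isoformat()


def daily_snapshots(stream, usd_per_btc):
    """Aggregate timestamped edges into one weighted directed graph per day.

    Each snapshot maps (source, target) to the summed BTC value, the summed
    USD value at that day's rate, and the number of aggregated transactions.
    """
    snapshots = defaultdict(
        lambda: defaultdict(lambda: {"btc": 0.0, "usd": 0.0, "n_tx": 0})
    )
    for timestamp, source, target, value_btc in stream:
        day = day_of(timestamp)
        edge = snapshots[day][(source, target)]
        edge["btc"] += value_btc
        edge["usd"] += value_btc * usd_per_btc[day]
        edge["n_tx"] += 1
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {day: dict(edges) for day, edges in snapshots.items()}


snapshots = daily_snapshots(STREAM, USD_PER_BTC)
```

In this sketch the stream graph keeps one record per timestamped transfer, while a snapshot collapses all transfers between the same pair of entities within a day into a single weighted edge. The daily window is a choice made for the example; other resolutions would only change `day_of`.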

Paper Structure (31 sections, 4 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: Illustration of the change of representation from the original blockchain data to our stream graph representation. Note the loss of information: the original data (left panel) keeps track of the exact flow, e.g., that coins sent by U1 to U2 come from U3 and U4, not U5. In the user-graph representation (right panel), this information is lost. However, in most cases it can be considered noise, and the user-network representation is more faithful to how users behave in the network.
  • Figure 2: Illustration of the snapshot (left panel) and stream graph (right panel) temporal network representations of Bitcoin transfers between users. The orange node represents the mining node, and dotted links are associated with fee transactions.
  • Figure 3: Comparison of the daily paid fees (top left panel) and their cumulative sum (top right panel) obtained from the ORBITAAL dataset (blue) with data from blockchain.com (red). The bottom panels show the relative error between the two datasets, where a positive (resp. negative) $\delta_r$ value indicates a higher accuracy for ORBITAAL (resp. the dataset of reference).
  • Figure 4: Comparison of the daily total transaction outputs in BTC (top left panel) and their cumulative sum (top right panel) obtained from the ORBITAAL dataset (blue) with data from blockchain.com (red). The bottom panels show the relative error between the two datasets, where a positive (resp. negative) $\delta_r$ value indicates a higher accuracy for ORBITAAL (resp. the dataset of reference).
  • Figure 5: Comparison of the daily number of distinct transactions (top left panel) and its cumulative sum (top right panel) obtained from the ORBITAAL dataset (blue) with data from blockchain.com (red). The bottom panels show the relative error between the two datasets, where a positive (resp. negative) $\delta_r$ value indicates a higher accuracy for ORBITAAL (resp. the dataset of reference).
  • ...and 3 more figures