An adversarially robust data-market for spatial, crowd-sourced data

Aida Manzano Kharman, Christian Jursitzky, Quan Zhou, Pietro Ferraro, Jakub Marecek, Pierre Pinson, Robert Shorten

TL;DR

The paper addresses fairness, privacy, and reliability challenges in crowd-sourced data markets by proposing a hybrid data-market architecture that combines a verification stage, reputation-based Maximum Entropy Voting (MEV), privacy-preserving data consensus, adaptive Shapley-value-based proof-of-work, and a distributed ledger with NFTs. Its contributions are a novel combination of verifiable data ownership, a scalable MEV variant (C-MEV), and a mean-median data-consensus algorithm that is robust to adversaries, all validated through simulations in a smart mobility use-case. The results show enhanced resilience to data-poisoning attacks, with higher practical breakdown points when MEV is combined with data consensus and a strong reputation system. The work has practical implications for deploying resilient, fair, and privacy-preserving data markets in crowd-sourced sensing applications, with smart contracts and NFTs handling access control and ownership management.

Abstract

We describe an architecture for a decentralised data market for applications in which agents are incentivised to collaborate to crowd-source their data. The architecture is designed to reward data that furthers the market's collective goal, and distributes reward fairly to all those that contribute with their data. We show that the architecture is resilient to Sybil, wormhole, and data poisoning attacks. In order to evaluate the resilience of the architecture, we characterise its breakdown points for various adversarial threat models in an automotive use case.

Paper Structure

This paper contains 22 sections, 9 equations, 7 figures, 1 table, and 1 algorithm.

Figures (7)

  • Figure 1: Data Market Architecture. Credit for the images is given in
  • Figure 2: Example of an election performed with C-MEV with parameters $J=3$, $K = 3$.
  • Figure 3: Access Control Mechanism: Agents in coalition $C_{q_1}$ must compute the Shapley value, $\psi(C_{q_2})$, of the new incoming data, $X_{t+1}$, using $X_{t+1}$ and $v_{C_{q_1}}(\cdot)$. Once the agents of $C_{q_2}$ receive their Shapley value, they must complete an amount of proof-of-work that is inversely proportional to it. This work consists of computing the Shapley value $\psi(X_{t+2})$ of the next incoming data, and so on. Once a coalition completes this work, it may enter the market.
  • Figure 4: SHAP Values of the samples of each feature
  • Figure 5: Characterisation of data consensus algorithms' behaviour under different degrees of coordinated data poisoning attacks.
  • ...and 2 more figures
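
The access-control idea in the Figure 3 caption (proof-of-work that shrinks as a contributor's Shapley value grows) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the toy value function (number of distinct road segments covered), the agent names, and the `base_work` constant in the hypothetical `required_work` rule. The Shapley values are computed exactly by enumerating all orderings, which is only feasible for a handful of agents.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def required_work(shapley, base_work=120.0, floor=1e-6):
    """Hypothetical rule: work is inversely proportional to the Shapley value.

    `base_work` and `floor` are illustrative constants, not values from the paper.
    """
    return base_work / max(shapley, floor)


# Toy data (assumption): road segments each agent's measurements cover.
coverage = {"a": {1, 2, 3}, "b": {3, 4}, "c": {3}}


def value(coalition):
    """Toy value function: number of distinct segments the coalition covers."""
    covered = set()
    for p in coalition:
        covered |= coverage[p]
    return float(len(covered))


phi = shapley_values(list(coverage), value)
work = {p: required_work(v) for p, v in phi.items()}

for p in coverage:
    print(f"agent {p}: Shapley value {phi[p]:.3f}, required work {work[p]:.1f}")
```

In this toy setting, agent `c` adds almost nothing beyond what the others already cover, so it receives the smallest Shapley value and the largest work requirement, while agent `a` enters most cheaply.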

Theorems & Definitions (10)

  • Definition 4.1: Vote
  • Definition 4.2: Aggregation of Votes
  • Definition 4.3: Agent Ordering
  • Definition 4.4: Ordering Set
  • Definition 4.5: Representative Probability
  • Definition 4.6: Probability Measure of Ordering Set
  • Remark
  • Definition 5.1: Sybil Attack
  • Definition 5.2: Wormhole Attack
  • Definition 5.3: Data Poisoning