Incentivizing Truthful Data Contributions in a Marketplace for Mean Estimation

Keran Chen, Alex Clinton, Kirthevasan Kandasamy

Abstract

We study a data marketplace where a broker intermediates between buyers, who seek to estimate the mean \(\mu\) of an unknown normal distribution \(\mathcal{N}(\mu, \sigma^2)\), and contributors, who can collect data from this distribution at a cost. The broker delegates data collection work to contributors, aggregates the reported datasets, sells the aggregated data to buyers, and redistributes revenue as payments to contributors. We aim to maximize welfare or profit under key constraints: individual rationality for buyers and contributors, incentive compatibility (contributors are incentivized to comply with data collection instructions and truthfully report the collected data), and budget balance (total contributor payments equal total revenue). We first compute welfare/profit-optimal prices under truthful reporting; however, to incentivize data collection and truthful data reporting, we adjust them based on discrepancies in contributors' reported data. This yields a Nash equilibrium (NE) where the two lowest-cost contributors collect all data. We complement this with two hardness results: (i) no nontrivial dominant-strategy incentive-compatible mechanism exists for this problem, and (ii) no mechanism outperforms ours in an NE.

Paper Structure

This paper contains 23 sections, 13 theorems, 58 equations, 2 tables, 1 algorithm.

Key Result

Theorem 3.1

For any mechanism $M$, if $s=\left\{( n_i ,X_i)\right\}_{i\in\mathcal{C}}$ is a dominant strategy profile, then no contributor collects any data, i.e., $\forall i\in\mathcal{C}$, $n'_i = n_i(N'_i, c_i) =0$.

Theorems & Definitions (15)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem 4.1
  • Remark 1
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Theorem D.1
  • Theorem D.1
  • ...and 5 more