Towards Data Auctions with Externalities

Anish Agarwal, Munther Dahleh, Thibaut Horel, Maryann Rui

TL;DR

Modeling a firm's utility solely through the increase in prediction accuracy it gains reduces the complex, combinatorial problem of allocating and pricing multiple datasets to the auction of a single digital (freely replicable) good.

Abstract

The design of data markets has gained importance as firms increasingly use machine learning models fueled by externally acquired training data. A key consideration is the externalities firms face when data, though inherently freely replicable, is allocated to competing firms. In this setting, we demonstrate that a data seller's optimal revenue increases as firms can pay to prevent allocations to others. To do so, we first reduce the combinatorial problem of allocating and pricing multiple datasets to the auction of a single digital good by modeling utility for data through the increase in prediction accuracy it provides. We then derive welfare- and revenue-maximizing mechanisms, highlighting how the form of firms' private information (whether the externalities one exerts on others are known, or vice versa) affects the resulting structures. In all cases, under appropriate assumptions, the optimal allocation rule is a single threshold per firm, where either all data is allocated or none is.
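As a rough illustration of the "single threshold per firm" structure described in the abstract, the sketch below allocates all data to a firm when its value meets a per-firm threshold, and none otherwise. The function names, the matrix `eta`, and the specific choice of threshold (the total externality a firm imposes on the others) are assumptions made for illustration; the abstract states only that the optimal rule is a threshold, not how the thresholds are computed.

```python
from typing import List, Sequence


def threshold_allocation(values: Sequence[float],
                         thresholds: Sequence[float]) -> List[int]:
    """All-or-nothing rule: firm i receives all data (1) iff values[i] >= thresholds[i]."""
    if len(values) != len(thresholds):
        raise ValueError("values and thresholds must have the same length")
    return [1 if v >= t else 0 for v, t in zip(values, thresholds)]


def externality_thresholds(eta: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical threshold choice (an assumption, not taken from the paper):
    firm i's threshold is the total harm it imposes on the other firms,
    where eta[j][i] is the externality firm j suffers when firm i is allocated."""
    n = len(eta)
    return [sum(eta[j][i] for j in range(n) if j != i) for i in range(n)]


# Two firms: firm 0 values the data at 2.0, firm 1 at 0.5.
values = [2.0, 0.5]
# eta[1][0] = 1.5: firm 1 is harmed by 1.5 if firm 0 gets the data, and so on.
eta = [[0.0, 1.0],
       [1.5, 0.0]]

thresholds = externality_thresholds(eta)
allocation = threshold_allocation(values, thresholds)
print(thresholds, allocation)
```

In this toy instance, firm 0's value exceeds the harm it causes, so it receives the full dataset, while firm 1 receives nothing. A revenue-maximizing seller would in general use different thresholds.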

Paper Structure

This paper contains 70 sections, 18 theorems, 105 equations, 2 figures, 2 tables, 1 algorithm.

Key Result

Theorem 3.1

The mechanism with allocation function [...], outside option $x_j(t_i = \emptyset, \boldsymbol{t}_{-i}) = \mathbb{1}\{W_j^i \geq 0\}$, and payment function [...], where $W_j^i$, the welfare contribution of bidder $j$ when (only) bidder $i$ chooses not to participate in the auction, is defined for $j \in N \setminus i$ as [...], maximizes welfare among all DSIC and ex-post IR auctions and has no positive transfers.

Figures (2)

  • Figure 1: Partition of type space by welfare versus revenue maximizing restricted dependency allocations in Scenario 1, assuming $v_1$ and $\eta_{2\gets 1}$ are uniformly distributed on their respective domains $[0, 3]$ and $[0,2]$. The shaded regions denote where bidder 1 is allocated the entire dataset ($x_1 = 1$) and the un-shaded regions correspond to the opposite case of $x_1 = 0$.
  • Figure 2: Contribution $R_i$ to total revenue due to the presence of bidder $i$ as a function of the externality caused by bidder $i$: $\sum_{j\in N\setminus i} \eta_{j \gets i}$. In black, the value of $R_i$ in the optimal mechanism, given by Equation \ref{eq:rev_vs_ext}. In red, a suboptimal mechanism that only charges the optimal threshold $\tau_i$ when allocating to bidder $i$, without extracting additional revenue from bidders $j\neq i$ corresponding to the threat of their outside option.

Theorems & Definitions (36)

  • Example 2.3
  • Remark 2.4: Mapping allocations to data subsets.
  • Remark 2.5
  • Definition 2.6: Dominant Strategy Incentive Compatibility
  • Definition 2.7: Ex-Post Individual Rationality
  • Definition 2.8: Bayes--Nash Incentive Compatibility
  • Definition 2.9: Interim Individual Rationality
  • Theorem 3.1: Efficient Mechanism, Scenario 1
  • Proposition 3.2: Impossibility of Ex-Post Optimality
  • Theorem 3.3: Efficient Mechanism, Scenario 2
  • ...and 26 more