Equilibrium of Data Markets with Externality

Safwan Hossain; Yiling Chen

TL;DR

This work analyzes fixed-price data markets where replication enables multiple buyers to purchase the same data, and where negative externalities across buyers affect welfare. It shows that, without intervention, pure Nash equilibria can yield poor social welfare, but a simple platform policy (charging a transaction cost based on predicted externalities) induces a dominant-strategy equilibrium and yields welfare close to optimal under a standard externality model; the welfare gap scales with $n(1-\alpha)$ and is robust to prediction bias. The paper then extends to an online setting where valuations are learned over time using a zooming bandit algorithm in a Hamming space of seller-bundle offers, proving instance-dependent regret bounds and preserving sublinear welfare regret. A richer joint externality model is explored, showing that while guarantees weaken, the transaction-cost intervention still provides meaningful welfare improvements and can yield an $\varepsilon$-PNE with WRaE bounded by $n/2$ under reasonable parameter choices. Overall, the results suggest a simple, revenue-neutral policy can align individual incentives with social welfare in data markets, with practical relevance for platforms like AWS Data Exchange and Snowflake Data Marketplace.

Abstract

We model real-world data markets, where sellers post fixed prices and buyers are free to purchase from any set of sellers, as a simultaneous game. A key component here is the negative externality buyers induce on one another due to data purchases. Starting with a simple setting where buyers know their valuations a priori, we characterize both the existence and welfare properties of the pure Nash equilibrium in the presence of such externality. While the outcomes are bleak without any intervention, mirroring the limitations of current data markets, we prove that for a standard class of externality functions, platforms intervening through a transaction cost can lead to a pure equilibrium with strong welfare guarantees. We next consider a more realistic setting where buyers learn their valuations over time through market interactions. Our intervention is feasible here as well, and we consider learning algorithms to achieve low regret concerning both individual and cumulative utility metrics. Lastly, we analyze the promises of this intervention under a much richer externality model.
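
The intuition behind the transaction-cost intervention can be illustrated with a small toy game. The sketch below is not the paper's model: the two-buyer instance, the gain numbers, the per-dataset harm `HARM`, and the additive form of the externality are all assumptions made for illustration, and the fee uses the true externality where the paper uses a predicted one. Under these assumptions, the externality a buyer suffers does not depend on their own purchase, so each buyer has a dominant choice, and charging a fee equal to the harm imposed on others shifts that choice toward the welfare-maximizing one.

```python
from itertools import product

# Hypothetical instance: 2 sellers, so a bundle is a 0/1 tuple over sellers.
BUNDLES = [(0, 0), (1, 0), (0, 1), (1, 1)]

# Invented net gains (value minus posted prices) for each buyer and bundle.
GAIN = [
    {(0, 0): 0.0, (1, 0): 3.0, (0, 1): 2.0, (1, 1): 4.0},  # buyer 0
    {(0, 0): 0.0, (1, 0): 2.0, (0, 1): 3.0, (1, 1): 4.0},  # buyer 1
]
N = len(GAIN)
HARM = 1.5  # assumed harm each purchased dataset imposes on every other buyer


def imposed(bundle):
    """Assumed externality this bundle imposes on EACH other buyer."""
    return HARM * sum(bundle)


def transaction_cost(bundle):
    """Assumed fee: total harm the bundle imposes on the other N - 1 buyers."""
    return (N - 1) * imposed(bundle)


def utility(i, profile, with_fee):
    """Buyer i's gain, minus harm suffered from others, minus the fee if charged."""
    suffered = sum(imposed(profile[j]) for j in range(N) if j != i)
    fee = transaction_cost(profile[i]) if with_fee else 0.0
    return GAIN[i][profile[i]] - suffered - fee


def welfare(profile):
    """Sum of utilities; fees are treated as transfers and excluded."""
    return sum(utility(i, profile, with_fee=False) for i in range(N))


def dominant_profile(with_fee):
    """Each buyer's best bundle. In this toy model the suffered harm does not
    depend on one's own bundle, so the best response is the same against any
    opponent profile; we evaluate it against the all-empty profile."""
    base = tuple(BUNDLES[0] for _ in range(N))
    chosen = []
    for i in range(N):
        def u(bundle, i=i):
            trial = base[:i] + (bundle,) + base[i + 1:]
            return utility(i, trial, with_fee)
        chosen.append(max(BUNDLES, key=u))
    return tuple(chosen)


no_fee = dominant_profile(with_fee=False)
fee = dominant_profile(with_fee=True)
best = max(product(BUNDLES, repeat=N), key=welfare)

print("no fee:", no_fee, "welfare", welfare(no_fee))
print("fee:   ", fee, "welfare", welfare(fee))
print("optimum:", best, "welfare", welfare(best))
```

In this instance both buyers purchase everything when no fee is charged, while the fee leads each to buy a single dataset and welfare rises. That the fee reaches the exact optimum here is an artifact of the toy assumptions; the summary above only claims welfare close to optimal.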

Paper Structure (22 sections, 14 theorems, 30 equations, 2 figures, 1 algorithm)

Introduction
Our contributions
Related Works
Model
Market Structure:
Buyer's Utility:
Game and Solution Concept:
Known vs Unknown Utilities:
Data Markets Game with Known Utility
Online setting with learned valuation
Algorithms
Online Effective Regret
Online welfare regret
Online welfare regret
A Richer Externality Model
...and 7 more sections

Key Result

Proposition 1

For the data markets game with no intervention, i.e. $\mathcal{T}_i(S) = 0, \,\forall i$, it is a dominant strategy for any buyer $i$ to select $\gamma_i^d = \mathop{\mathrm{arg\,max}}\limits_{\gamma}{g_i(\gamma)}$. However, there exists an instance of this game where the WRaE is maximal (i.e. $\Theta

Figures (2)

Figure 1: Standard Externality Model: Avg % increase in social welfare (with 90% confidence interval) from gain maximizing decision.
Figure 2: Joint Externality Model: Avg % increase in social welfare (with 90% confidence interval) from gain maximizing decision.

Theorems & Definitions (34)

Definition 1: Utility and Welfare
Definition 2: WRaE
Proposition 1
proof
Definition 3: Predicted Externality
Definition 4: Transaction Cost
Theorem 1
proof
Definition 5
Definition 6
...and 24 more
