Equilibrium of Data Markets with Externality
Safwan Hossain, Yiling Chen
TL;DR
This work analyzes fixed-price data markets where replication enables multiple buyers to purchase the same data, and where negative externalities across buyers affect welfare. It shows that, without intervention, pure Nash equilibria can yield poor social welfare. A simple platform policy, charging a transaction cost based on predicted externalities, induces a dominant-strategy equilibrium and yields welfare close to optimal under a standard externality model; the welfare gap scales with $n(1-\alpha)$ and is robust to prediction bias. The paper then extends to an online setting where valuations are learned over time using a zooming bandit algorithm over a Hamming space of seller-bundle offers, proving instance-dependent regret bounds and preserving sublinear welfare regret. A richer joint externality model is also explored: the guarantees weaken, but the transaction-cost intervention still provides meaningful welfare improvements and can yield an $\varepsilon$-PNE with WRaE bounded by $n/2$ under reasonable parameter choices. Overall, the results suggest that a simple, revenue-neutral policy can align individual incentives with social welfare in data markets, with practical relevance for platforms like AWS Data Exchange and Snowflake Data Marketplace.
Abstract
We model real-world data markets, where sellers post fixed prices and buyers are free to purchase from any set of sellers, as a simultaneous game. A key component here is the negative externality buyers induce on one another due to data purchases. Starting with a simple setting where buyers know their valuations a priori, we characterize both the existence and welfare properties of the pure Nash equilibrium in the presence of such externality. While the outcomes are bleak without any intervention, mirroring the limitations of current data markets, we prove that for a standard class of externality functions, platforms intervening through a transaction cost can lead to a pure equilibrium with strong welfare guarantees. We next consider a more realistic setting where buyers learn their valuations over time through market interactions. Our intervention is feasible here as well, and we consider learning algorithms that achieve low regret with respect to both individual and cumulative utility metrics. Lastly, we analyze the promise of this intervention under a much richer externality model.
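
As a rough illustration of the transaction-cost idea, the following Python sketch uses a toy model that is not the paper's own: buyer values are assumed additive across sellers, every purchase is assumed to impose the same fixed harm on each other buyer, welfare is taken to be the sum of buyer utilities, and collected fees are assumed to be rebated so that they cancel out of welfare. All names (`best_response`, `welfare`, `harm`, `fee`) and all numbers are hypothetical.

```python
from itertools import product


def best_response(values, prices, fee):
    # Toy assumption: with additive values and a harm that does not depend on
    # the buyer's own action, buying from seller j iff value > price + fee is
    # optimal regardless of what other buyers do.
    return [v > p + fee for v, p in zip(values, prices)]


def welfare(all_values, prices, harm, purchases):
    # Sum of buyer utilities: each purchase yields (value - price) to the
    # buyer and costs `harm` to each of the other n - 1 buyers.
    # Fees are assumed rebated, so they do not appear here.
    n = len(all_values)
    return sum(
        all_values[i][j] - prices[j] - harm * (n - 1)
        for i, row in enumerate(purchases)
        for j, bought in enumerate(row)
        if bought
    )


# Hypothetical instance: 3 buyers, 2 sellers.
values = [[5.0, 1.0], [3.0, 4.0], [2.5, 0.5]]
prices = [2.0, 1.0]
harm = 1.0
n, m = len(values), len(prices)

# Outcome with no intervention vs. a fee equal to the total harm per purchase.
no_fee = [best_response(v, prices, 0.0) for v in values]
with_fee = [best_response(v, prices, harm * (n - 1)) for v in values]

no_fee_welfare = welfare(values, prices, harm, no_fee)
fee_welfare = welfare(values, prices, harm, with_fee)

# Brute-force welfare optimum over all 2^(n*m) purchase profiles.
best_welfare = max(
    welfare(values, prices, harm, [bits[i * m:(i + 1) * m] for i in range(n)])
    for bits in product([False, True], repeat=n * m)
)

print(no_fee_welfare, fee_welfare, best_welfare)
```

In this toy instance, setting the fee equal to the total harm a purchase imposes leads buyers to make exactly the welfare-maximizing purchases, whereas the no-fee outcome has lower welfare. The paper's guarantees concern a more general externality model and fees based on predicted, possibly biased, externalities, which this sketch does not capture.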
