Learn then Decide: A Learning Approach for Designing Data Marketplaces
Yingqi Gao, Jin Zhou, Hua Zhou, Yong Chen, Xiaowu Dai
TL;DR
This paper addresses pricing in data marketplaces under valuation uncertainty by introducing MAPP, a two-stage mechanism that learns value distributions through auctions and then sets a posted price for later buyers. The approach achieves incentive compatibility and limited price discrimination, with strong regret guarantees: $O_p(n^{-1})$ per round when using historical data and a no-regret bound $\bar{R}_T = O_p(T^{-1/2}(\log T)^2)$ in online settings. It combines KDE-based learning, RDE with a structured exponential-family approximation, and functional PCA to enable robust price discovery across sequentially sold, unlimited-supply datasets. Empirical results on simulations and real FCC AWS-3 data show improved revenue and faster learning while maintaining accessibility for buyers. The work advances data-market optimization by integrating auction-based learning with posted pricing, offering scalable, fair pricing in dynamic data ecosystems.
Abstract
As data marketplaces become increasingly central to the digital economy, it is crucial to design efficient pricing mechanisms that optimize revenue while ensuring fair and adaptive pricing. We introduce the Maximum Auction-to-Posted Price (MAPP) mechanism, a novel two-stage approach that first estimates the bidders' value distribution through auctions and then determines the optimal posted price based on the learned distribution. We establish that MAPP is individually rational and incentive-compatible, ensuring truthful bidding while balancing revenue maximization with minimal price discrimination. MAPP achieves a regret of $O_p(n^{-1})$ when incorporating historical bid data, where $n$ is the number of bids in the current round. It outperforms existing methods while imposing weaker distributional assumptions. For sequential dataset sales over $T$ rounds, we propose an online MAPP mechanism that dynamically adjusts pricing across datasets with varying value distributions. Our approach achieves no-regret learning, with the average cumulative regret converging at a rate of $O_p(T^{-1/2}(\log T)^2)$. We validate the effectiveness of MAPP through simulations and real-world data from the FCC AWS-3 spectrum auction.
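The two-stage idea described above (learn the value distribution from bids, then choose a posted price) can be illustrated with a minimal sketch. This is not the paper's MAPP mechanism: it assumes bids equal true values, uses a plain Gaussian kernel estimate of the value CDF with a rule-of-thumb bandwidth, and picks the price by grid search over the expected per-buyer revenue $p\,(1 - \hat{F}(p))$ under unlimited supply. The function names `kde_cdf` and `posted_price`, the bandwidth rule, and the simulated Uniform(0, 1) bids are all illustrative assumptions.

```python
import math
import random
import statistics


def kde_cdf(bids, bandwidth):
    """Return a smoothed CDF estimate built from Gaussian kernels centred on the bids."""
    scale = bandwidth * math.sqrt(2.0)

    def cdf(x):
        # Average of the Gaussian CDFs centred at each observed bid.
        return sum(0.5 * (1.0 + math.erf((x - b) / scale)) for b in bids) / len(bids)

    return cdf


def posted_price(bids, grid_size=200):
    """Pick the grid price that maximises estimated revenue p * (1 - F_hat(p))."""
    n = len(bids)
    # Rule-of-thumb bandwidth (an assumption, not the paper's choice).
    bandwidth = 1.06 * statistics.stdev(bids) * n ** (-1.0 / 5.0)
    cdf = kde_cdf(bids, bandwidth)

    lo, hi = min(bids), max(bids)
    grid = [lo + (hi - lo) * k / (grid_size - 1) for k in range(grid_size)]
    # With unlimited supply, each later buyer purchases iff their value is at least p.
    return max(grid, key=lambda p: p * (1.0 - cdf(p)))


# Stage 1: simulated auction bids, assumed truthful and drawn from Uniform(0, 1).
random.seed(0)
bids = [random.uniform(0.0, 1.0) for _ in range(500)]

# Stage 2: posted price for later buyers, based on the learned distribution.
price = posted_price(bids)
print(f"posted price: {price:.3f}")
```

For Uniform(0, 1) values the revenue-maximising posted price is 0.5, so the printed price should land in that neighbourhood; the exact value depends on the sample and the bandwidth.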
