Learn then Decide: A Learning Approach for Designing Data Marketplaces

Yingqi Gao, Jin Zhou, Hua Zhou, Yong Chen, Xiaowu Dai

TL;DR

This paper addresses pricing in data marketplaces under valuation uncertainty by introducing MAPP, a two-stage mechanism that learns value distributions through auctions and then sets a posted price for later buyers. The approach achieves incentive compatibility and limited price discrimination, with strong regret guarantees: $O_p(n^{-1})$ per round when using historical data and a no-regret bound $\bar{R}_T = O_p(T^{-1/2}(\log T)^2)$ in online settings. It combines KDE-based learning, RDE with a structured exponential-family approximation, and functional PCA to enable robust price discovery across sequential, unlimited-supply datasets. Empirical results on simulations and real FCC AWS-3 data demonstrate improved revenue performance and faster learning while maintaining accessibility for buyers. The work advances data-market optimization by integrating auction-based learning with posted pricing, offering scalable, fair pricing in dynamic data ecosystems.

Abstract

As data marketplaces become increasingly central to the digital economy, it is crucial to design efficient pricing mechanisms that optimize revenue while ensuring fair and adaptive pricing. We introduce the Maximum Auction-to-Posted Price (MAPP) mechanism, a novel two-stage approach that first estimates the bidders' value distribution through auctions and then determines the optimal posted price based on the learned distribution. We establish that MAPP is individually rational and incentive-compatible, ensuring truthful bidding while balancing revenue maximization with minimal price discrimination. MAPP achieves a regret of $O_p(n^{-1})$ when incorporating historical bid data, where $n$ is the number of bids in the current round. It outperforms existing methods while imposing weaker distributional assumptions. For sequential dataset sales over $T$ rounds, we propose an online MAPP mechanism that dynamically adjusts pricing across datasets with varying value distributions. Our approach achieves no-regret learning, with the average cumulative regret converging at a rate of $O_p(T^{-1/2}(\log T)^2)$. We validate the effectiveness of MAPP through simulations and real-world data from the FCC AWS-3 spectrum auction.
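
The two-stage idea in the abstract (learn the value distribution from bids, then choose a posted price from it) can be illustrated with a small sketch. This is not the paper's MAPP mechanism. It assumes the bids are truthful draws from the value distribution, uses a Gaussian-kernel estimate of the CDF with a normal-reference bandwidth, and picks the posted price by grid search over the expected per-buyer revenue $p\,(1 - \hat{F}(p))$ for an unlimited-supply good. The function names, the bandwidth rule, and the grid size are all illustrative assumptions.

```python
import math
import random


def kde_cdf(bids, x, bandwidth):
    """Gaussian-kernel estimate of F(x) = P(value <= x) from observed bids."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - b) / scale)) for b in bids) / len(bids)


def rule_of_thumb_bandwidth(bids):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5) (an illustrative choice)."""
    n = len(bids)
    mean = sum(bids) / n
    sd = math.sqrt(sum((b - mean) ** 2 for b in bids) / (n - 1))
    if sd == 0.0:
        raise ValueError("bids must not all be identical")
    return 1.06 * sd * n ** (-0.2)


def learn_posted_price(bids, grid_size=200):
    """Stage 1: estimate F from bids. Stage 2: maximize p * (1 - F_hat(p)) on a grid."""
    h = rule_of_thumb_bandwidth(bids)
    lo, hi = min(bids), max(bids)
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]
    # Expected revenue per later buyer at posted price p (unlimited supply).
    revenue = [p * (1.0 - kde_cdf(bids, p, h)) for p in grid]
    best = max(range(grid_size), key=revenue.__getitem__)
    return grid[best], revenue[best]


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical bids: values uniform on [0, 1], where p * (1 - p) peaks at p = 0.5.
    sample_bids = [random.random() for _ in range(2000)]
    price, expected_revenue = learn_posted_price(sample_bids)
    print(f"posted price: {price:.3f}, estimated revenue per buyer: {expected_revenue:.3f}")
```

For values uniform on $[0, 1]$ the true revenue curve is $p(1 - p)$, so the learned price should land near $0.5$. A grid search is used here only because it is simple and makes no assumption about the shape of the estimated revenue curve.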

Paper Structure

This paper contains 22 sections, 6 theorems, 8 equations, 6 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

The MAPP mechanism satisfies individual rationality.

Figures (6)

  • Figure 1: Illustration of an online algorithm based on the MAPP mechanism, where $T$ is the total number of rounds.
  • Figure 2: Distributions of instantaneous regrets based on data simulated from truncated Gaussian distributions over [0.9, 10.1].
  • Figure 3: Distributions of instantaneous regrets based on data simulated from beta distributions over [0.9, 10.1].
  • Figure 4: Distributions of instantaneous regrets based on data simulated from truncated exponential distributions over [0.9, 10.1].
  • Figure 5: Distributions of FCC AWS-3 Auction data.
  • ...and 1 more figure

Theorems & Definitions (6)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Theorem 1
  • Theorem 2
  • Theorem 3