Bayesian Regression Markets
Thomas Falconer, Jalal Kazempour, Pierre Pinson
TL;DR
The paper tackles the challenge of incentivizing data sharing for regression tasks by introducing a Bayesian regression market that accounts for parameter uncertainty through posterior predictive inference. It extends prior work by enabling uncertainty-aware valuations, using Shapley-value-based revenue allocation, and exploring both likelihood-based and information-based (KL-divergence-based) designs to mitigate financial risk. The authors prove universal and asymptotic market properties under the different designs and demonstrate, via simulations and a real-world solar irradiance case study, that the KL-based, information-valued designs reduce risk and stabilize payments, especially in small-sample or nonstationary settings. The framework has practical implications for data markets: it enables robust, uncertainty-aware compensation mechanisms that align the incentives of data owners and buyers in decentralized analytics tasks.
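To make the idea of Shapley-value-based allocation under a posterior predictive valuation concrete, here is a minimal sketch. It is not the paper's mechanism. It assumes a conjugate Gaussian linear model with a known noise variance, uses the average out-of-sample log predictive density as the value of a coalition, and computes exact Shapley values by enumeration. All names (`posterior`, `avg_log_predictive`, `shapley_values`, `run_example`), the synthetic data, and the hyperparameters are hypothetical choices made for illustration.

```python
import itertools
import math

import numpy as np


def posterior(X, y, noise_var=1.0, prior_var=10.0):
    """Gaussian posterior over weights (zero-mean isotropic prior, known noise variance)."""
    d = X.shape[1]
    precision = np.eye(d) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov


def avg_log_predictive(X_tr, y_tr, X_te, y_te, noise_var=1.0, prior_var=10.0):
    """Average log posterior predictive density on held-out data."""
    mean, cov = posterior(X_tr, y_tr, noise_var, prior_var)
    mu = X_te @ mean
    # Predictive variance = observation noise + parameter uncertainty x^T cov x.
    var = noise_var + np.einsum("ij,jk,ik->i", X_te, cov, X_te)
    ll = -0.5 * (np.log(2 * np.pi * var) + (y_te - mu) ** 2 / var)
    return float(ll.mean())


def shapley_values(n_players, value):
    """Exact Shapley values; `value` maps a sorted tuple of player indices to a float."""
    phi = np.zeros(n_players)
    n_fact = math.factorial(n_players)
    for i in range(n_players):
        others = [j for j in range(n_players) if j != i]
        for k in range(len(others) + 1):
            weight = math.factorial(k) * math.factorial(n_players - k - 1) / n_fact
            for S in itertools.combinations(others, k):
                with_i = tuple(sorted(S + (i,)))
                phi[i] += weight * (value(with_i) - value(S))
    return phi


def run_example(seed=0, n_train=200, n_test=200):
    """Synthetic market: a central agent (intercept only) and three support agents."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    central = np.ones((n, 1))            # feature owned by the central agent
    support = rng.normal(size=(n, 3))    # one feature per hypothetical support agent
    # The third support feature is irrelevant by construction.
    y = 0.5 + 1.0 * support[:, 0] + 0.3 * support[:, 1] + rng.normal(scale=0.5, size=n)

    def value(S):
        X = np.hstack([central, support[:, list(S)]])
        return avg_log_predictive(
            X[:n_train], y[:n_train], X[n_train:], y[n_train:], noise_var=0.25
        )

    phi = shapley_values(3, value)
    gain = value((0, 1, 2)) - value(())  # total improvement to be shared
    return phi, gain


phi, gain = run_example()
print("Shapley shares:", phi)
print("Total gain:", gain)
```

By the efficiency property of the Shapley value, the shares sum to the total improvement of the full coalition over the central agent's own model. A payment could then be formed by multiplying each share by a price the buyer is willing to pay per unit of improvement, which is again an assumption of this sketch. Exact enumeration scales exponentially in the number of support agents, so it is only practical for small markets.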
Abstract
Although machine learning tasks are highly sensitive to the quality of input data, relevant datasets can often be challenging for firms to acquire, especially when held privately by a variety of owners. For instance, if these owners are competitors in a downstream market, they may be reluctant to share information. Focusing on supervised learning for regression tasks, we develop a regression market to provide a monetary incentive for data sharing. Our mechanism adopts a Bayesian framework, allowing us to consider a more general class of regression tasks. We present a thorough exploration of the market properties, and show that similar proposals in the literature expose the market agents to sizeable financial risks, which can be mitigated in our setup.
