Data Exchange Markets via Utility Balancing
Aditya Bhaskara, Sreenivas Gollapudi, Sungjin Im, Kostas Kollias, Kamesh Munagala, Govind S. Sankar
TL;DR
The paper designs a data-exchange market without monetary transfers, in which a central clearinghouse balances interim utility across agents with heterogeneous datasets and ML tasks. It formalizes a Data Exchange Problem with sharing rules (notably Shapley value and proportional sharing), proves that welfare maximization is NP-hard, and gives polynomial-time approximation algorithms based on a multiplicative weights framework, including results for submodular and concave utilities. The work further establishes core-stability results, including the existence of ε-approximate cores and efficient 2-stable Greedy constructions, and analyzes strategic behavior with respect to welfare. It also extends the model to imbalanced exchanges. Simulations on road-traffic mean-estimation tasks show substantial welfare gains over pairwise-trading baselines, illustrating the approach's practical relevance for collaborative data sharing in heterogeneous ML settings.
Abstract
This paper explores the design of a balanced data-sharing marketplace for entities with heterogeneous datasets and machine learning models that they seek to refine using data from other agents. The goal of the marketplace is to encourage participation in data sharing in the presence of such heterogeneity. Our market design approach for data sharing focuses on interim utility balance, where participants contribute and receive equitable utility from refinement of their models. We present such a market model, for which we study computational complexity, solution existence, and approximation algorithms for welfare maximization and core stability. Finally, we support our theoretical insights with simulations on a mean estimation task inspired by road traffic delay estimation.
