On the Theoretical Foundations of Data Exchange Economies

Hannaneh Akrami; Bhaskar Ray Chaudhury; Jugal Garg; Aniket Murhekar

TL;DR

This work develops a theoretical foundation for data exchange economies in which data is a replicable asset, introducing fairness via utility-sharing (e.g., Shapley shares) and core-stability against coalitions. It proves the existence of exchanges that are both fair and core-stable for all monotone continuous utilities and sharing rules satisfying monotonicity, normalization, and efficiency, using a fixed-point construction on a convex-like domain $Z$ and a reciprocity-enforcing map $f$, with the fixed point mapping to a reciprocal, fair exchange. For computability, the authors embed the domain to obtain PPAD-membership and design a local-search algorithm that achieves $\varepsilon$-core-stable and $\varepsilon$-reciprocal exchanges under cross-monotone shares and $L$-Lipschitz utilities, placing the problem in CLS (PPAD $\cap$ PLS). They discuss perturbations to ensure non-satiation, present a sequence of algorithmic steps (decreasing/increasing data flows) to balance surpluses, and outline open questions for extending to supermodular utilities and decentralized data exchange. Overall, the paper offers a principled framework and computational pathways for fair, stable data exchanges, highlighting foundational directions in data economics and inviting further exploration of dynamics, decentralization, and complexity boundaries.

Abstract

The immense success of ML systems relies heavily on large-scale, high-quality data. The high demand for data has led to many paradigms that involve selling, exchanging, and sharing data, motivating the study of economic processes with data as an asset. However, data differs from classical economic assets in terms of free duplication: there is no concept of limited supply since it can be replicated at zero marginal cost. This distinction introduces fundamental differences between economic processes involving data and those concerning other assets. We study a parallel to exchange (Arrow-Debreu) markets where data is the asset. Here, agents with datasets exchange data fairly and voluntarily, aiming for mutual benefit without monetary compensation. This framework is particularly relevant for non-profit organizations that seek to improve their ML models through data exchange, yet are restricted from selling their data for profit. We propose a general framework for data exchange, built on two core principles: (i) fairness, ensuring that each agent receives utility proportional to their contribution to others; contributions are quantifiable using standard credit-sharing functions like the Shapley value, and (ii) stability, ensuring that no coalition of agents can identify an exchange among themselves which they unanimously prefer to the current exchange. We show that fair and stable exchanges exist for all monotone continuous utility functions. Next, we investigate the computational complexity of finding approximate fair and stable exchanges. We present a local search algorithm for instances with monotone submodular utility functions, where each agent's contributions are measured using the Shapley value. We prove that this problem lies in CLS under mild assumptions. Our framework opens up several intriguing theoretical directions for research in data economics.
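To make the credit-sharing idea concrete, here is a minimal sketch of exact Shapley shares for a toy monotone submodular utility. It is not the paper's construction: the agent names, the `datasets` dictionary, the `coverage` utility (number of distinct data points), and the helper `shapley_values` are all hypothetical, chosen only for illustration. The sketch assumes that the utility of the receiving agent depends only on the set of contributors.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    players = list(players)
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal gain of adding p to the agents that precede it.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


# Hypothetical contributors, each holding a set of data point identifiers.
datasets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}


def coverage(coalition):
    """Toy monotone submodular utility: number of distinct data points."""
    points = set()
    for agent in coalition:
        points |= datasets[agent]
    return len(points)


shares = shapley_values(datasets, coverage)
for agent, share in shares.items():
    print(agent, share)  # a 5/2, b 1, c 7/2
```

The shares sum to the utility of the full set of contributors (7 here), which is the efficiency property the TL;DR requires of a sharing rule. Enumerating all orderings takes factorial time, so this is only workable for a handful of agents.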
