Protecting Data Buyer Privacy in Data Markets
Minxing Zhang, Jian Pei
TL;DR
This paper addresses the privacy of data buyers in data markets, an area largely neglected by prior work, which has focused on data owners and third parties. It formalizes buyer privacy in terms of a true intent $V_1\wedge\cdots\wedge V_n$ and a published intent $U_1\wedge\cdots\wedge U_n$, and introduces three attacker models (PI-uniform, efficiency maximization, and purchased record inference) within a $\lambda$-privacy framework. It proposes an expansion-based protection method and multiple allocation strategies that minimize disclosure while balancing purchase cost, and evaluates the approach experimentally on real (Adult) and synthetic datasets, reporting substantial privacy gains with manageable utility loss. The results show how dimensionality, true intent size, and the parameters $\lambda$ and $\alpha$ influence the privacy-utility trade-off, which offers guidance for practical deployment in data marketplaces.
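To make the true-intent versus published-intent distinction concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that each $V_i$ and $U_i$ is a set of acceptable values for attribute $i$, that a record satisfies an intent when every attribute value falls in the corresponding set, and that expansion means $V_i \subseteq U_i$. The names `matches`, `expand`, and `select`, along with the toy records, are hypothetical and chosen only for illustration; the $\lambda$-privacy criterion and the allocation strategies are not modeled here.

```python
from typing import Dict, List, Set

Record = Dict[str, str]
# Assumption: an intent maps each attribute to its set of acceptable values,
# and the sets are combined conjunctively (V_1 AND ... AND V_n).
Intent = Dict[str, Set[str]]


def matches(record: Record, intent: Intent) -> bool:
    """Return True if the record satisfies every attribute constraint."""
    return all(record[attr] in allowed for attr, allowed in intent.items())


def expand(true_intent: Intent, extra: Intent) -> Intent:
    """Build a published intent by adding decoy values to each attribute.

    Assumption: expansion only ever adds values, so V_i is a subset of U_i.
    """
    return {
        attr: values | extra.get(attr, set())
        for attr, values in true_intent.items()
    }


def select(records: List[Record], intent: Intent) -> List[Record]:
    """Return the records a buyer would receive for a given intent."""
    return [r for r in records if matches(r, intent)]


# Toy records (made up for illustration, not drawn from any dataset).
records = [
    {"education": "Bachelors", "occupation": "Sales"},
    {"education": "Masters", "occupation": "Sales"},
    {"education": "Bachelors", "occupation": "Tech-support"},
    {"education": "Doctorate", "occupation": "Exec-managerial"},
]

true_intent = {"education": {"Bachelors"}, "occupation": {"Sales"}}
decoys = {"education": {"Masters"}, "occupation": {"Tech-support"}}
published_intent = expand(true_intent, decoys)

wanted = select(records, true_intent)          # what the buyer actually needs
purchased = select(records, published_intent)  # what the buyer pays for
```

Under these assumptions, the purchased set always contains the wanted set, and the extra records are the purchase cost paid for hiding the true intent. This is the privacy-cost trade-off the summary refers to, in its simplest form.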
Abstract
Data markets serve as crucial platforms facilitating data discovery, exchange, sharing, and integration among data users and providers. However, privacy research in this setting has predominantly centered on protecting data owners and third parties, neglecting the challenges of protecting the privacy of data buyers. In this article, we address this gap by modeling data buyer privacy protection and investigating the balance between privacy and purchase cost. Comprehensive experiments yield insights into the efficacy and efficiency of our proposed approaches.
