Protecting Data Buyer Privacy in Data Markets

Minxing Zhang, Jian Pei

TL;DR

This paper tackles the problem of protecting data buyers' privacy in data markets, an area largely neglected by prior work, which focuses on data owners and third parties. It formalizes buyer privacy using a true intent $V_1\wedge\cdots\wedge V_n$ and a published intent $U_1\wedge\cdots\wedge U_n$, and introduces three attacker models (PI-uniform, efficiency maximization, and purchased record inference) within a $\lambda$-privacy framework. It proposes an expansion-based protection method and multiple allocation strategies to minimize disclosure while balancing purchase cost, and validates the approach with experiments on real (Adult) and synthetic datasets, showing substantial privacy gains with manageable utility loss. The results show how dimensionality, true intent size, and parameter settings ($\lambda$, $\alpha$) influence privacy-utility trade-offs, supporting practical deployment in data marketplaces.
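To make the notation concrete, here is a minimal sketch of how a true intent and an expanded published intent could be represented. It is an illustration under stated assumptions, not the paper's method: the summary does not give the formal definitions, so the sketch assumes each $V_i$ and $U_i$ is a set of values on dimension $i$, that a valid expansion requires $V_i \subseteq U_i$, and that a simplified uniform-guess attacker has confidence equal to the ratio of true-intent cells to published-intent cells. The function names and the example values are hypothetical.

```python
from math import prod


def is_valid_expansion(true_intent, published_intent):
    """Check that every published set U_i contains the true set V_i.

    ASSUMPTION: an intent is a list of value sets, one per dimension.
    """
    if len(true_intent) != len(published_intent):
        return False
    return all(v <= u for v, u in zip(true_intent, published_intent))


def cell_count(intent):
    """Number of value combinations covered by a conjunction of sets."""
    return prod(len(values) for values in intent)


def uniform_guess_confidence(true_intent, published_intent):
    """Simplified stand-in for an attacker's confidence.

    ASSUMPTION: the attacker treats every cell of the published intent
    as equally likely to belong to the true intent. This is not
    necessarily the paper's PI-uniform definition.
    """
    return cell_count(true_intent) / cell_count(published_intent)


def satisfies_threshold(true_intent, published_intent, lam):
    """ASSUMPTION: the privacy requirement is confidence <= lambda."""
    return uniform_guess_confidence(true_intent, published_intent) <= lam


# Hypothetical two-dimensional example (age group, occupation code).
true_intent = [{"30-39"}, {"A", "B"}]
published_intent = [{"20-29", "30-39"}, {"A", "B", "C", "D"}]

valid = is_valid_expansion(true_intent, published_intent)
confidence = uniform_guess_confidence(true_intent, published_intent)
```

In this toy example the true intent covers 2 cells and the published intent covers 8, so the assumed confidence is 0.25. A wider published intent lowers that number but also means buying more records, which is the privacy versus purchase-cost trade-off the paper studies.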

Abstract

Data markets serve as crucial platforms facilitating data discovery, exchange, sharing, and integration among data users and providers. However, the paramount concern of privacy has predominantly centered on protecting privacy of data owners and third parties, neglecting the challenges associated with protecting the privacy of data buyers. In this article, we address this gap by modeling the intricacies of data buyer privacy protection and investigating the delicate balance between privacy and purchase cost. Through comprehensive experimentation, our results yield valuable insights, shedding light on the efficacy and efficiency of our proposed approaches.

Paper Structure

This paper contains 22 sections, 22 equations, 4 figures, and 4 tables.

Figures (4)

  • Figure 1: A Running Example to Illustrate the Expansion Method
  • Figure 2: Effect of True Intent Size on Attacker's Confidence, Number of Records Purchased in True Intent (TI) and Published Intent (PI), and Published Intent Size regarding Dimensions "Age", "Ethnicity", and "Hours per Week" for PI-uniform Attack (PI-uniform) and Efficiency Maximization Attack (EM).
  • Figure 3: Effect of Privacy Threshold $\lambda$.
  • Figure 4: Effect of Weight Parameter $\alpha$