Collective Privacy Recovery: Data-sharing Coordination via Decentralized Artificial Intelligence
Evangelos Pournaras, Mark Christopher Ballandies, Stefano Bennati, Chien-fei Chen
TL;DR
This work tackles collective privacy loss by treating personal data as a scarce resource and enabling decentralized coordination that minimizes data sharing while maintaining service quality. It formalizes a multi-criteria, personalized-valuation framework with $k$ criteria and $m=\prod_{u=1}^k l_u$ scenarios, assigning rewards via $\hat{r}_{i,j}$ and adjusting privacy through $p_i$, all under a structured recruitment and living-lab protocol. The authors validate the approach in a large living-lab experiment, using causal inference and cluster analysis to identify the criteria that predict privacy and five key data-sharing behaviors, and demonstrating that coordinated data sharing can yield privacy gains together with cost reductions for service providers. They further contrast valuation schemes (absolute/relative, shared/sacrificed data) and show that coordinated data sharing often outperforms intrinsic or rewarded sharing in terms of privacy recovery, supported by extensive conjoint analysis and ANOVA. Overall, the study provides a scalable, AI-driven pathway for collective privacy recovery with practical implications for privacy-aware data ecosystems.
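To make the scenario count $m=\prod_{u=1}^k l_u$ concrete, the following minimal sketch enumerates scenarios as the Cartesian product of criterion levels. It assumes that $l_u$ denotes the number of levels of criterion $u$; the criterion names, their levels, and the helper functions are hypothetical and chosen only for illustration, not taken from the study.

```python
from itertools import product
from math import prod

# Hypothetical criteria and levels (illustrative only, not from the study).
criteria = {
    "sensor": ["gps", "accelerometer", "microphone"],
    "collector": ["company", "government"],
    "context": ["health", "transport"],
}

def count_scenarios(level_counts):
    """Return m, the product of l_u over the k criteria (assumed reading of l_u)."""
    return prod(level_counts)

def enumerate_scenarios(criteria):
    """Return every combination that picks one level per criterion."""
    names = list(criteria)
    return [dict(zip(names, combo)) for combo in product(*criteria.values())]

level_counts = [len(levels) for levels in criteria.values()]  # l_1, ..., l_k
m = count_scenarios(level_counts)
scenarios = enumerate_scenarios(criteria)
```

With these hypothetical levels, $k=3$ and $m = 3 \cdot 2 \cdot 2 = 12$, which matches the length of the enumerated list.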
Abstract
Collective privacy loss is becoming a colossal problem, an emergency for personal freedoms and democracy. But are we prepared to handle personal data as a scarce resource and to share data collectively under the doctrine: as little as possible, as much as necessary? We hypothesize a significant privacy recovery if a population of individuals, the data collective, coordinates to share the minimum data needed to run online services at the required quality. Here we show how to automate and scale up complex collective arrangements for privacy recovery using decentralized artificial intelligence. To this end, we compare for the first time attitudinal, intrinsic, rewarded and coordinated data sharing in a rigorous, highly realistic living-lab experiment involving >27,000 real data disclosures. Using causal inference and cluster analysis, we differentiate criteria predicting privacy and five key data-sharing behaviors. Strikingly, data-sharing coordination proves to be a win-win for all: remarkable privacy recovery for people and evident cost reductions for service providers.
