Segmented Private Data Aggregation in the Multi-message Shuffle Model
Shaowei Wang, Hongqiao Chen, Sufen Zeng, Ruilin Yang, Hui Jiang, Peigen Ye, Kaiqi Yu, Rundong Mei, Shaozheng Huang, Wei Yang, Bangzhou Xin
TL;DR
This work addresses the need for flexible privacy protection in decentralized data collection by introducing agnostic segmented privacy within the multi-message shuffle differential privacy framework. The authors decouple user data from privacy-level choices, anonymize privacy preferences, and optimize the use of blanket messages to improve aggregation utility while maintaining DP guarantees. They provide a concrete protocol for set-valued data analyses with almost-tight privacy amplification bounds and validate substantial utility gains (up to about a 50% reduction in estimation error) on both real and synthetic datasets. The approach offers practical benefits for decentralized data analytics, enabling heterogeneous privacy preferences without sacrificing privacy or utility and requiring only a few messages per user across two interaction rounds.
Abstract
The shuffle model of differential privacy (DP) offers compelling privacy-utility trade-offs in decentralized settings (e.g., internet of things, mobile edge networks). In particular, the multi-message shuffle model, where each user may contribute multiple messages, has shown that accuracy can approach that of the central model of DP. However, existing studies typically assume a uniform privacy protection level for all users, which may deter conservative users from participating and prevent liberal users from contributing more information, thereby reducing the overall data utility, such as the accuracy of aggregated statistics. In this work, we pioneer the study of segmented private data aggregation within the multi-message shuffle model of DP, introducing flexible privacy protection for users and enhanced utility for the aggregation server. Our framework not only protects users' data but also anonymizes their privacy level choices to prevent potential data leakage from these choices. To optimize the privacy-utility-communication trade-offs, we explore approximately optimal configurations for the number of blanket messages and conduct almost tight privacy amplification analyses within the shuffle model. Through extensive experiments, we demonstrate that our segmented multi-message shuffle framework achieves a reduction of about 50% in estimation error compared to existing approaches, significantly enhancing both privacy and utility.
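
The general idea of blanket messages in a multi-message shuffle setting can be illustrated with a minimal Python sketch. This is not the authors' protocol: the function names, the mapping from privacy level to blanket-message count (`BLANKETS_PER_LEVEL`), and the uniform blanket distribution are all assumptions made for illustration, and the sketch makes no claim about the resulting DP guarantee. Each user sends their true item plus some uniformly random blanket messages, a shuffler permutes all messages, and the server subtracts the expected blanket count per item.

```python
import random
from collections import Counter

# Hypothetical mapping from a user's chosen privacy level to the number of
# blanket messages they add (illustrative values, not taken from the paper).
BLANKETS_PER_LEVEL = {"liberal": 1, "moderate": 3, "conservative": 6}

def user_messages(item, level, domain_size, rng):
    """Return one user's messages: the true item plus uniform blanket messages."""
    num_blanket = BLANKETS_PER_LEVEL[level]
    blanket = [rng.randrange(domain_size) for _ in range(num_blanket)]
    return [item] + blanket

def shuffle_messages(messages, rng):
    """Shuffler: output a uniformly random permutation of all messages."""
    out = list(messages)
    rng.shuffle(out)
    return out

def estimate_counts(shuffled, domain_size, total_blanket):
    """Server: debias observed counts by the expected blanket count per item."""
    observed = Counter(shuffled)
    expected_noise = total_blanket / domain_size
    return [observed.get(v, 0) - expected_noise for v in range(domain_size)]

if __name__ == "__main__":
    rng = random.Random(0)
    domain_size = 4
    users = [(0, "liberal"), (1, "conservative"), (1, "moderate"), (3, "liberal")]
    pool = []
    for item, level in users:
        pool.extend(user_messages(item, level, domain_size, rng))
    total_blanket = sum(BLANKETS_PER_LEVEL[level] for _, level in users)
    print(estimate_counts(shuffle_messages(pool, rng), domain_size, total_blanket))
```

In this sketch the server is assumed to know the total number of blanket messages, which it can derive from the message count and the number of users. The estimates are unbiased for the true item counts because each blanket message lands on any given item with probability `1 / domain_size`.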
