Unified Locational Differential Privacy Framework

Aman Priyanshu, Yash Maurya, Suriya Ganesh, Vy Tran

TL;DR

We address private geographical data analysis by proposing a unified locational differential privacy framework that supports diverse data types (one-hot vectors, booleans, integers, and floats) over geographic regions. The framework uses local DP mechanisms (randomized response, the exponential mechanism, and the Gaussian mechanism), implemented with Diffprivlib and tracked with Opacus, and includes shuffling for privacy amplification. It is evaluated on four simulated location-aggregation datasets. The results show that increasing the privacy budget $\epsilon$ improves utility (lower MSE) and that the Gaussian and exponential mechanisms generally outperform randomized response for numerical data. The work offers a practical, extensible toolkit for privacy-preserving geographic analysis and may be released as open source to facilitate adoption and further research.
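To make the relationship between $\epsilon$ and MSE concrete, here is a minimal sketch of binary randomized response with a debiased aggregate estimate. It is an illustration only and not the paper's implementation: it uses the Python standard library rather than Diffprivlib, and the function names (`randomized_response`, `estimate_rate`, `mse_for_epsilon`) and the simulated population are assumptions introduced here.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else (not bit)


def estimate_rate(reports, epsilon):
    """Debias the observed fraction of True reports.

    E[observed] = r * p + (1 - r) * (1 - p), so r = (observed - (1 - p)) / (2p - 1).
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


def mse_for_epsilon(true_bits, epsilon, trials, seed):
    """Average squared error of the debiased estimate over repeated runs."""
    rng = random.Random(seed)
    true_rate = sum(true_bits) / len(true_bits)
    total = 0.0
    for _ in range(trials):
        reports = [randomized_response(b, epsilon, rng) for b in true_bits]
        total += (estimate_rate(reports, epsilon) - true_rate) ** 2
    return total / trials


if __name__ == "__main__":
    # Hypothetical region with 2000 users, 30% of whom hold a True value.
    bits = [i % 10 < 3 for i in range(2000)]
    for eps in (0.5, 1.0, 2.0, 4.0):
        print(eps, mse_for_epsilon(bits, eps, trials=30, seed=0))
```

In this sketch a larger $\epsilon$ means each user tells the truth more often, so the debiasing denominator $2p - 1$ grows and the variance of the estimate shrinks, which is the utility trend the summary describes.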

Abstract

Aggregating statistics over geographical regions is important for many applications, such as analyzing income, election results, and disease spread. However, the sensitive nature of this data necessitates strong privacy protections to safeguard individuals. In this work, we present a unified locational differential privacy (DP) framework to enable private aggregation of various data types, including one-hot encoded, boolean, float, and integer arrays, over geographical regions. Our framework employs local DP mechanisms such as randomized response, the exponential mechanism, and the Gaussian mechanism. We evaluate our approach on four datasets representing significant location data aggregation scenarios. Results demonstrate the utility of our framework in providing formal DP guarantees while enabling geographical data analysis.
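For float-valued data, per-region aggregation under a local Gaussian mechanism could look roughly like the sketch below. Several things here are assumptions rather than details from the paper: values are clipped to a known range `[lo, hi]`, the noise scale uses the classical calibration $\sigma = \sqrt{2\ln(1.25/\delta)}\,\Delta/\epsilon$ (valid for $0 < \epsilon < 1$), the region label is treated as non-sensitive, and all function names are hypothetical. The paper itself relies on Diffprivlib, which this standard-library sketch does not use.

```python
import math
import random


def gaussian_sigma(epsilon, delta, sensitivity):
    """Classical Gaussian-mechanism noise scale; requires 0 < epsilon < 1."""
    if not (0.0 < epsilon < 1.0):
        raise ValueError("classical calibration assumes 0 < epsilon < 1")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def privatize(value, lo, hi, epsilon, delta, rng):
    """Clip one user's value to [lo, hi] and add Gaussian noise locally."""
    clipped = min(max(value, lo), hi)
    # Under local DP, any two inputs can differ by at most hi - lo.
    sigma = gaussian_sigma(epsilon, delta, hi - lo)
    return clipped + rng.gauss(0.0, sigma)


def regional_means(records, lo, hi, epsilon, delta, seed=0):
    """Average the noisy reports per region.

    `records` is an iterable of (region, value) pairs; the region label
    is assumed to be public in this sketch.
    """
    rng = random.Random(seed)
    sums, counts = {}, {}
    for region, value in records:
        noisy = privatize(value, lo, hi, epsilon, delta, rng)
        sums[region] = sums.get(region, 0.0) + noisy
        counts[region] = counts.get(region, 0) + 1
    return {region: sums[region] / counts[region] for region in sums}


if __name__ == "__main__":
    # Hypothetical data: two regions with different true means.
    data = [("A", 0.2)] * 5000 + [("B", 0.8)] * 5000
    print(regional_means(data, lo=0.0, hi=1.0, epsilon=0.9, delta=1e-5, seed=1))
```

Because the noise is zero-mean, averaging many noisy reports within a region recovers the regional mean with error that shrinks as the number of users grows, at the cost of heavy noise on any single report.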

Paper Structure (22 sections, 1 equation, 7 figures)

Figures (7)

  • Figure 1: We present an abstraction of what our simulated data is describing.
  • Figure 2: Apple's implementation [DpApple]: map showing final normalized histograms of top categories for photos taken near the Brooklyn Bridge in New York City.
  • Figure 3: "The secure aggregation protocol that enforces the DP assurance. Each binary vector is split into two shares, and each share is encrypted with a different public key. As long as no single entity gets access to both private keys, nobody can see any individual vectors, only the aggregate, which satisfies the desired DP assurance" [DpApple].
  • Figure 4: Simulator Front-end: UI of Streamlit app to allow users to select $\epsilon$ value, data type and privacy mechanism to apply to preset map region.
  • Figure 5: Simulator Result: Map showing final normalized histograms of top categories for photos taken near the CMU Campus in Pittsburgh, PA.
  • ...and 2 more figures