Numerical Estimation of Spatial Distributions under Differential Privacy

Leilei Du, Peng Cheng, Libin Zheng, Xiang Lian, Lei Chen, Wei Xi, Wangze Ni

TL;DR

This work tackles private estimation of spatial distributions under Local Differential Privacy by recognizing the ordinal geometry of 2-D spatial data. It introduces Spatial Area Mechanism (SAM) and its optimal instantiation Disk Area Mechanism (DAM), which project two-dimensional data onto one dimension using Radon transforms and optimize using the sliced Wasserstein distance to preserve spatial relationships. A Hybrid Uniform-Exponential Mechanism (HUEM) provides a baseline SAM with an explicit probability form, while DAM achieves provable optimality among SAMs and is adapted to grids via bucketizing and post-processing. Extensive experiments on real (Chicago crimes, NYC taxis) and synthetic datasets show DAM consistently outperforms state-of-the-art methods (MDSW, SEM-Geo-I) in estimating private spatial distributions, particularly at finer granularity and larger privacy budgets, underscoring the practical value of geometry-aware LDP for geospatial analytics.

Abstract

Estimating spatial distributions is important in data analysis tasks such as traffic flow forecasting and epidemic prevention. Accurate spatial distribution estimation requires collecting sufficient user data, but collecting data directly from individuals could compromise their privacy. Most previous work focuses on private distribution estimation for one-dimensional data; it does not consider the relationships among spatial data, which leads to poor accuracy for spatial distribution estimation. In this paper, we address the problem of private spatial distribution estimation, where we collect spatial data from individuals and aim to minimize the distance between the actual distribution and the estimated one under Local Differential Privacy (LDP). To leverage the numerical nature of the domain, we project spatial data and its relationships onto a one-dimensional distribution, and then use this projection to estimate the overall spatial distribution. Specifically, we propose a reporting mechanism called Disk Area Mechanism (DAM), which projects the spatial domain onto a line and optimizes the estimation using the sliced Wasserstein distance. Through extensive experiments on both real and synthetic data sets, we compare DAM with state-of-the-art methods such as Multi-dimensional Square Wave Mechanism (MDSW) and Subset Exponential Mechanism with Geo-I (SEM-Geo-I). In these experiments, DAM consistently performs better than MDSW, and it performs better than SEM-Geo-I when the data granularity is fine enough.
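The sliced Wasserstein distance mentioned in the abstract can be illustrated with a minimal sketch. The code below is not the paper's DAM or its optimization procedure; it is a generic estimate that assumes two equal-size sets of 2-D points, projects them onto a fixed set of evenly spaced directions, and averages the one-dimensional Wasserstein-1 distances. The function name `sliced_wasserstein` and the parameter `n_projections` are hypothetical names chosen for illustration.

```python
import numpy as np


def sliced_wasserstein(x, y, n_projections=64):
    """Approximate the sliced Wasserstein-1 distance between two 2-D point sets.

    x, y: arrays of shape (n, 2) with the same n (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 2 or x.shape[1] != 2:
        raise ValueError("x and y must both have shape (n, 2)")

    # Evenly spaced directions on the half circle [0, pi); the other half
    # gives the same projections up to sign, so it adds no information.
    angles = np.linspace(0.0, np.pi, n_projections, endpoint=False)
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (k, 2)

    # Project every point onto every direction: result has shape (n, k).
    px = np.sort(x @ directions.T, axis=0)
    py = np.sort(y @ directions.T, axis=0)

    # For equal-size 1-D samples, Wasserstein-1 is the mean absolute
    # difference of the sorted values; average that over all directions.
    return float(np.mean(np.abs(px - py)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.random((200, 2))
    shifted = points + np.array([1.0, 0.0])
    print(sliced_wasserstein(points, points))
    print(sliced_wasserstein(points, shifted))
```

Because each projection reduces the comparison to sorted one-dimensional values, the distance stays sensitive to how far apart points are in the plane. For example, translating a point set by one unit yields a value close to 2/π, the average of |cos θ| over directions, rather than a constant penalty that ignores geometry.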

Paper Structure

This paper contains 29 sections, 25 equations, 14 figures, 5 tables, 2 algorithms.

Figures (14)

  • Figure 1: Shooting victims per 1,000 residents of Chicago in 2021.
  • Figure 2: Radon Transform and I/O domain with any real point.
  • Figure 3: Radon transform and sliced Wasserstein distance transform.
  • Figure 4: Non-shrunken/Shrunken areas in grid division.
  • Figure 5: The process of border shrinkage in discrete DAM.
  • ...and 9 more figures

Theorems & Definitions (10)

  • proof (10 entries)