Privacy Guarantees for Personal Mobility Data in Humanitarian Response

Nitin Kohli, Emily Aiken, Joshua Blumenstock

TL;DR

This paper develops and tests an approach for releasing private mobility data with formal guarantees over the privacy of the underlying subjects. It introduces an algorithm for constructing differentially private mobility matrices and derives privacy and accuracy bounds for this algorithm.

Abstract

Personal mobility data from mobile phones and other sensors are increasingly used to inform policymaking during pandemics, natural disasters, and other humanitarian crises. However, even aggregated mobility traces can reveal private information about individual movements to potentially malicious actors. This paper develops and tests an approach for releasing private mobility data, which provides formal guarantees over the privacy of the underlying subjects. Specifically, we (1) introduce an algorithm for constructing differentially private mobility matrices, and derive privacy and accuracy bounds on this algorithm; (2) use real-world data from mobile phone operators in Afghanistan and Rwanda to show how this algorithm can enable the use of private mobility data in two high-stakes policy decisions: pandemic response and the distribution of humanitarian aid; and (3) discuss practical decisions that need to be made when implementing this approach, such as how to optimally balance privacy and accuracy. Taken together, these results can help enable the responsible use of private mobility data in humanitarian response.

Paper Structure

This paper contains 30 sections, 12 theorems, 10 equations, 5 figures, 6 tables, 1 algorithm.

Key Result

Lemma 1

For a statistic $S:\mathbb{D} \rightarrow \mathbb{R}^m$, let $\Delta_S$ be the supremum of $\|S(d) - S(d')\|_1$, where $d,d'\in \mathbb{D}$ are datasets that differ in one element. For any $\Delta_S \in (0,\infty)$, the algorithm $A(d) = S(d) + (\zeta_1,\ldots,\zeta_m)^T$, where each $\zeta_i$ is drawn
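The lemma statement is cut off in this listing, so the distribution of each $\zeta_i$ is not shown here. The theorem list names Lemma 1 as the Laplace Mechanism; the sketch below therefore assumes the standard form of that mechanism, in which each $\zeta_i$ is drawn independently from a zero-mean Laplace distribution with scale $\Delta_S/\epsilon$. The function name `laplace_mechanism` and its parameters are illustrative, not taken from the paper.

```python
import numpy as np


def laplace_mechanism(stat, sensitivity, epsilon, rng=None):
    """Return stat plus i.i.d. Laplace noise (assumed scale: sensitivity / epsilon).

    stat        -- the statistic S(d), a vector in R^m
    sensitivity -- Delta_S, the L1 sensitivity of S, in (0, inf)
    epsilon     -- the privacy parameter, > 0
    rng         -- optional numpy Generator, for reproducibility
    """
    if not (0 < sensitivity < np.inf):
        raise ValueError("sensitivity must lie in (0, inf)")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    stat = np.asarray(stat, dtype=float)
    # One independent noise draw per coordinate of the statistic.
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=stat.shape)
    return stat + noise


if __name__ == "__main__":
    counts = np.array([120.0, 45.0, 8.0])
    print(laplace_mechanism(counts, sensitivity=1.0, epsilon=0.1,
                            rng=np.random.default_rng(0)))
```

A smaller $\epsilon$ gives a larger noise scale, which is the privacy and accuracy trade-off discussed in the abstract.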

Figures (5)

  • Figure 1: Panel A: A portion of an origin-destination matrix, showing movement between 10 districts of Afghanistan on January 1, 2015, as calculated from data provided by a mobile phone operator. Panel B: The origin-destination matrix is passed into our private O-D matrix algorithm (as described in Algorithm \ref{algo:od}) for trip-level protection ($T = 1$) with privacy parameter $\epsilon = 0.1$ and suppression threshold $\tau = 15$. Panel C: The same portion of the origin-destination matrix, this time after being privatized by our private O-D matrix algorithm (with $\epsilon = 0.1$). Panel D: Movement from Kabul to other districts of Afghanistan on January 1, 2015, based on the highlighted row of the private origin-destination matrix in Panel B. Kabul, the origin district, is shown in black.
  • Figure 2: Left: Absolute errors in matrix entries for differentially private O-D matrices, relative to non-private matrices (derived over all days in our Afghanistan 2020 dataset and all origin-destination pairs at the admin-2 or province level). Absolute error is calculated as the difference between an O-D matrix count in the private matrix and the corresponding O-D matrix count in the non-private matrix. The distributions shown are taken over all 305 days in our 2020 CDR dataset (see Table \ref{table:data}) and all origin-destination province pairs. Right: Relative errors in matrix entries for differentially private O-D matrices, relative to non-private matrices. Relative error is calculated as twice the absolute error in a private O-D matrix count, divided by the sum of the private and non-private O-D matrix counts for the same cell. Again, the distributions shown are taken over all 305 days in our 2020 CDR dataset (see Table \ref{table:data}) and all origin-destination province pairs.
  • Figure S1: Epidemic curves based on mobility-based SIR models and O-D matrices derived from call detail records. Top: Pandemic initiating in Kabul. Middle: Pandemic initiating in Hirat. Bottom: Pandemic initiating randomly.
  • Figure S2: Total out-migration from areas used in simulations of humanitarian response to natural disasters and violent events, calculated from call detail records. Top: Out-migration during the Battle of Kunduz in Afghanistan. Bottom: Out-migration following the Lake Kivu Earthquake in Rwanda.
  • Figure S3: Sensitivity of the accuracy of identifying the top-$k$ regions of out-migration, in simulations of natural disasters and violent events, to the value of $k$. Left: admin-2 level. Right: admin-3 level. Lines are not present where there are fewer than $k$ regions of out-migration after suppression of small counts.
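The captions for Figures 1 and 2 describe a pipeline (noise addition with parameter $\epsilon$, suppression below a threshold $\tau$, then absolute and relative error per cell) that can be sketched as follows. The paper's Algorithm 1 is not reproduced in this listing, so several details here are assumptions: Laplace noise with scale $T/\epsilon$, suppression applied to the noisy count, suppressed cells reported as 0, and a relative error of 0 where both counts are 0. All function names are hypothetical.

```python
import numpy as np


def privatize_od_matrix(od_counts, epsilon, tau, T=1, rng=None):
    """Sketch of a private O-D matrix: add noise, then suppress small counts.

    Assumptions (not confirmed by the source): noise scale is T / epsilon,
    and noisy counts below tau are replaced by 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    od_counts = np.asarray(od_counts, dtype=float)
    noisy = od_counts + rng.laplace(0.0, T / epsilon, size=od_counts.shape)
    # Suppression acts only on the noisy values, so it is post-processing.
    return np.where(noisy >= tau, noisy, 0.0)


def absolute_error(private, non_private):
    """Magnitude of the difference between private and non-private counts."""
    return np.abs(np.asarray(private, float) - np.asarray(non_private, float))


def relative_error(private, non_private):
    """Twice the absolute error divided by the sum of the two counts."""
    private = np.asarray(private, dtype=float)
    non_private = np.asarray(non_private, dtype=float)
    numer = 2.0 * absolute_error(private, non_private)
    denom = private + non_private
    out = np.zeros_like(numer)  # assumed value where both counts are 0
    np.divide(numer, denom, out=out, where=denom > 0)
    return out


if __name__ == "__main__":
    od = np.array([[0, 120, 9], [80, 0, 30], [4, 25, 0]])
    private = privatize_od_matrix(od, epsilon=0.1, tau=15,
                                  rng=np.random.default_rng(0))
    print(private)
    print(relative_error(private, od))
```

With this definition the relative error lies between 0 and 2, and it equals 2 whenever a nonzero true count is suppressed to 0.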

Theorems & Definitions (20)

  • Definition 1
  • Lemma 1: Laplace Mechanism (Dwork et al., 2006)
  • Lemma 2: Composition (Dwork and Roth, 2014)
  • Lemma 3: Group-Privacy (Dwork and Roth, 2014)
  • Lemma 4: Post-Processing (Dwork and Roth, 2014)
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • ...and 10 more
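
The statements of Lemmas 2 to 4 are not reproduced in this listing. For orientation, the standard textbook forms of the properties they are named after are given below for pure $\epsilon$-differential privacy. The notation ($A$, $A_1$, $A_2$, $f$, $k$, $E$) is assumed here and may differ from the paper's.

$$
\begin{aligned}
&\text{$\epsilon$-differential privacy:} && \Pr[A(d) \in E] \le e^{\epsilon}\,\Pr[A(d') \in E] \ \text{ for all } d, d' \text{ differing in one element and all events } E \\
&\text{Composition:} && A_1 \text{ is } \epsilon_1\text{-DP and } A_2 \text{ is } \epsilon_2\text{-DP} \implies (A_1, A_2) \text{ is } (\epsilon_1 + \epsilon_2)\text{-DP} \\
&\text{Group privacy:} && A \text{ is } \epsilon\text{-DP} \implies \Pr[A(d) \in E] \le e^{k\epsilon}\,\Pr[A(d') \in E] \ \text{ for } d, d' \text{ differing in } k \text{ elements} \\
&\text{Post-processing:} && A \text{ is } \epsilon\text{-DP} \implies f \circ A \text{ is } \epsilon\text{-DP for any function } f \text{ that does not access } d
\end{aligned}
$$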