Enabling Humanitarian Applications with Targeted Differential Privacy

Nitin Kohli; Joshua Blumenstock

TL;DR

The paper tackles privacy in algorithmic targeting by introducing Targeted Differential Privacy (TDP), an adaptation of differential privacy that preserves enough information to distinguish between sufficiently different individuals while protecting those who are similar. It proposes a private projection algorithm that maps data to a higher-dimensional space and then privately reprojects it, achieving $(B,\epsilon,\delta)$-TDP and enabling accurate targeting in high-stakes humanitarian settings. Through two real-world case studies in Togo and Nigeria, the work quantifies the privacy-utility tradeoffs, showing that substantial privacy gains can be achieved with relatively small losses in targeting accuracy, and it analyzes protection against singling-out, attribute inference, and distinguishing attacks. The framework provides practical guidance for program designers to configure privacy parameters, facilitating responsible data use in social protection and credit contexts while aligning with legal and ethical privacy standards.

Abstract

The proliferation of mobile phones in low- and middle-income countries has suddenly and dramatically increased the extent to which the world's poorest and most vulnerable populations can be observed and tracked by governments and corporations. Millions of historically "off the grid" individuals are now passively generating digital data; these data, in turn, are being used to make life-altering decisions about those individuals -- including whether or not they receive government benefits, and whether they qualify for a consumer loan. This paper develops an approach to implementing algorithmic decisions based on personal data, while also providing formal privacy guarantees to data subjects. The approach adapts differential privacy to applications that require decisions about individuals, and gives decision makers granular control over the level of privacy guaranteed to data subjects. We show that stronger privacy guarantees typically come at some cost, and use data from two real-world applications -- an anti-poverty program in Togo and a consumer lending platform in Nigeria -- to illustrate those costs. Our empirical results quantify the tradeoff between privacy and predictive accuracy, and characterize how different privacy guarantees impact overall program effectiveness. More broadly, our results demonstrate a way for humanitarian programs to responsibly use personal data, and better equip program designers to make informed decisions about data privacy.

Paper Structure

This paper contains 16 sections, 11 theorems, 27 equations, 3 figures, 1 table, and 1 algorithm.

Preliminaries.
Related contextual adaptations.
Necessary conditions for accurate targeting.
Singling-out protection.
Attribute inference protection.
Distinguishing protection.
Interpreting privacy protection scores.
Supplementary Note: Targeted Differential Privacy
Formalizing the targeting problem and setting
Targeted differential privacy
Formal analysis of the necessary conditions for accurate targeting
Private projection algorithm details
Supplementary Note: Additional Information on Privacy Measures
Relative protection score using the holdout-approach for attribute inference
Computing the distinguishing protection score for our algorithm
...and 1 more sections

Key Result

Lemma 1

[Post-Processing] Suppose $A$ from $\mathbb{L}_2^{n\times d}$ to $\mathbb{O}$ satisfies $(B,\epsilon, \delta)$-TDP. Then for any (potentially randomized) function $F$ from $\mathbb{O}$ to some space $\mathbb{O}'$, $F \circ A$ also satisfies $(B,\epsilon, \delta)$-TDP.
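The reasoning behind Lemma 1 can be sketched for a deterministic $F$. This sketch assumes, since the definition is not reproduced here, that $(B,\epsilon,\delta)$-TDP bounds output probabilities in the usual differential-privacy form, $\Pr[A(X) \in S] \le e^{\epsilon}\Pr[A(X') \in S] + \delta$, for every pair of datasets $X, X'$ that the definition treats as neighbors and every measurable $S \subseteq \mathbb{O}$. Under that assumption, for any measurable $S' \subseteq \mathbb{O}'$, taking $S = F^{-1}(S')$ gives:

$$
\begin{aligned}
\Pr[F(A(X)) \in S'] &= \Pr[A(X) \in F^{-1}(S')] \\
&\le e^{\epsilon}\,\Pr[A(X') \in F^{-1}(S')] + \delta \\
&= e^{\epsilon}\,\Pr[F(A(X')) \in S'] + \delta .
\end{aligned}
$$

For a randomized $F$ whose randomness is independent of $A$, the same bound holds for each fixed realization of that randomness, so it also holds after averaging over it.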

Figures (3)

Figure 1: Overview of targeting setting (top) and the private projection algorithm (bottom). (A) Personal data are held by a data holder (e.g., mobile network operator). These data are given to (B) an algorithm that generates a version of the data with a provable privacy guarantee. (C) This private version of the data can then be joined with (D) training "labels" from third parties that indicate the true eligibility for a subset of the population. (E) Machine learning models learn how to predict eligibility status for the full population for whom eligibility status is not directly observed, but for whom private data exist. (F) These predictions can then be used in downstream applications. Bottom figures provide a conceptual sketch of how the private projection algorithm works. (G) Each individual's raw data $X$ is projected into $\mathbb{R}^k$ (H), and is then projected back to $\mathbb{R}^d$ using the singular values of $X$ (I). The resulting record corresponds to individual $i$'s record in $X_{priv}$.
Figure 2: The impact of different privacy-enhancing technologies on targeting accuracy in two real-world settings. (A) Anti-poverty program in Togo. In simulations of a nationwide humanitarian program, differential privacy ($\epsilon \approx 4$) and k-anonymity ($k=2$) would increase exclusion errors by $115K$ and $240K$, relative to the non-private status quo. Our approach (targeted differential privacy) increases exclusion errors by $2K$ ($B = 0.25$). (B) Micro-lending platform in Nigeria. Simulating the accuracy of credit scoring algorithms used to extend loans to individuals without a formal financial history, differential privacy ($\epsilon \approx 3$) and $k$-anonymity ($k=2$) would reduce the profits of the program by 430% and 476%, respectively. Targeted differential privacy would reduce profits by 12% ($B = 0.1$).
Figure 3: Empirical tradeoffs between privacy and program effectiveness for a humanitarian program in Togo (left panels) and a digital lending platform in Nigeria (right panels). Privacy protections are shown for singling-out attacks (top row), attribute inference attacks (middle row), and distinguishing attacks (bottom row). For all panels, blue stars represent the non-private status quo, and red stars represent classic $(\epsilon, \delta)$-differential privacy. Green stars correspond to targeted differential privacy, with $B = 0.25$ and $\epsilon\approx 4$ for the Togolese program and $B = 0.1$ and $\epsilon \approx 3$ for the Nigerian application.
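The workflow in Figure 1 (steps A to F) can be illustrated with a minimal Python sketch on synthetic data. Everything here is an assumption made for illustration: the function `noisy_project_and_return`, its `noise_scale` parameter, the Gaussian projection, and the pseudo-inverse used to map back to $\mathbb{R}^d$ are hypothetical stand-ins. This is not the paper's private projection algorithm (which uses the singular values of $X$) and it carries no proven privacy guarantee.

```python
import numpy as np


def noisy_project_and_return(X, k, noise_scale, rng):
    """Hypothetical stand-in for steps (G)-(I) of Figure 1.

    Projects rows of X from R^d into R^k, adds Gaussian noise there, and
    maps the result back to R^d with a pseudo-inverse. NOT the paper's
    algorithm; no privacy guarantee is claimed for it.
    """
    n, d = X.shape
    P = rng.normal(size=(d, k)) / np.sqrt(k)           # random map R^d -> R^k
    Z = X @ P + rng.normal(scale=noise_scale, size=(n, k))
    return Z @ np.linalg.pinv(P)                       # map back to R^d


rng = np.random.default_rng(0)
n, d, k = 500, 5, 8                                    # k > d: higher-dimensional space

# (A) Raw personal data held by the data holder (synthetic here).
X = rng.normal(size=(n, d))
# Synthetic ground-truth eligibility, unknown to the program for most people.
eligible = (X @ rng.normal(size=d) > 0).astype(float)

# (B)-(C) Produce the perturbed version of the data.
X_priv = noisy_project_and_return(X, k, noise_scale=0.1, rng=rng)

# (D) Third-party labels exist only for a subset of the population.
labeled = rng.choice(n, size=100, replace=False)

# (E) Fit a simple least-squares scorer on the perturbed features.
design = np.c_[X_priv[labeled], np.ones(len(labeled))]
coef, *_ = np.linalg.lstsq(design, eligible[labeled], rcond=None)

# (F) Predict eligibility for the full population.
scores = np.c_[X_priv, np.ones(n)] @ coef
predicted = scores > 0.5
accuracy = (predicted == eligible.astype(bool)).mean()
print(f"targeting accuracy on synthetic data: {accuracy:.3f}")
```

Raising `noise_scale` in this toy setup perturbs the features more and would be expected to lower `accuracy`, which mirrors, only qualitatively, the privacy-utility tradeoff the figures describe.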

Theorems & Definitions (23)

Definition 1
Definition 2
Example 1
Definition 3
Definition 4
Lemma 1
Lemma 2
Lemma 3
Lemma 4
proof
...and 13 more
